Methods and Compositions for Haploid Mapping

ABSTRACT

The present invention relates to the field of plant breeding. More specifically, the present invention includes a method of using haploid plants for genetic mapping of traits of interest such as disease resistance. Further, the invention includes a method for breeding corn plants containing quantitative trait loci (QTL) that are associated with resistance to Gray Leaf Spot (GLS), a fungal disease associated with Cercospora spp. The invention further includes a method for breeding corn plants containing QTL that are associated with Goss' Wilt, a bacterial disease associated with Clavibacter michiganense spp.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application No. 60/966,706, filed Aug. 29, 2007 and incorporated herein by reference in its entirety.

INCORPORATION OF SEQUENCE LISTING

A sequence listing contained in the file named “46_25(54886_0001_US).txt” which is 2432225 bytes (measured in MS-Windows®), created on Aug. 21, 2008, and comprising 1,361 nucleotide sequences, is electronically filed herewith and is incorporated herein by reference in its entirety.

FIELD OF THE INVENTION

The present invention relates to the field of plant breeding. More specifically, the present invention includes a method of using haploid plants for genetic mapping of traits such as disease resistance. Further, the invention includes a method for breeding corn plants containing quantitative trait loci (QTL) that are associated with resistance to gray leaf spot (GLS), a fungal disease associated with Cercospora spp. The invention further includes a method for breeding corn plants containing QTL that are associated with Goss' Wilt, a bacterial disease associated with Clavibacter michiganense spp.

BACKGROUND OF THE INVENTION

Plant breeding is greatly facilitated by the use of doubled haploid (DH) plants. The production of DH plants enables plant breeders to obtain inbred lines without multigenerational inbreeding, thus decreasing the time required to produce homozygous plants. DH plants provide an invaluable tool to plant breeders, particularly for generating inbred lines, QTL mapping, cytoplasmic conversions, and trait introgression. A great deal of time is spared as homozygous lines are essentially instantly generated, negating the need for multigenerational conventional inbreeding.

In particular, because DH plants are entirely homozygous, they are very amenable to quantitative genetics studies. Both additive variance and additive×additive genetic variances can be estimated from DH populations. Other applications include identification of epistasis and linkage effects. For plant breeders, DH populations have been particularly useful in QTL mapping, cytoplasmic conversions, and trait introgression. Moreover, there is value in testing and evaluating homozygous lines for plant breeding programs. All of the genetic variance is among progeny in a breeding cross, which improves selection gain.

Methods of utilizing haploids in genetic studies have been described in the art. A statistical method to utilize pooled haploid DNA to estimate parental linkage phase and to construct genetic linkage maps has been described (Gasbarra, D. et al., Genetics 172: 1325-1335 (2006)). An additional study has used the method of crossing haploid wheat plants with cultivars to map a leaf rust resistance gene in wheat (Hiebert, C. et al., Theor Appl Genet 110: 1453-1457 (2005)). Haploid plants and SSR markers have been used in linkage map construction of cotton (Song, X. et al., Genome 48:378-392 (2005)). Further, AFLP marker analysis has been performed in monoploid potato (Varrieur, J., Thesis, AFLP Marker Analysis of Monoploid Potato (2002)). To date, a method of using haploid plants to genetically map loci associated with traits of interest is lacking. The present invention provides a method of using haploid plants to genetically map traits of interest.

Two diseases which cause significant damage to corn crops are Gray Leaf Spot (GLS) caused by the fungal pathogen Cercospora zeae-maydis (CZ) and Goss' Wilt caused by the bacterial pathogen Clavibacter michiganensis subsp. nebraskensis (CN). GLS is a global problem and, in addition to prevalence in Africa, Central America and South America, it has spread across most of the U.S. Corn Belt over the past 10-15 years. The fungus overwinters in field debris and requires moisture, usually in the form of heavy fogs, dews, or rains, to spread its spores and infect corn. Increasing pervasiveness has been linked to no-till practices which promote retention of fungi, such as CZ, in the soil (Paul et al., Phytopathology 95:388-396 (2005)). Symptoms include rectangular necrotic lesions which can coalesce into larger affected regions, and symptoms usually appear later in the growing season. GLS in corn elicits an increased allocation of plant resources to damaged leaf tissue, leading to elevated risk for root and stalk rots, which ultimately results in even greater crop losses (Ward et al. 1999; Saghai-Maroof et al., Theor. Appl. Genet. 93:539-546 (1996)). Yield loss associated with GLS can be high if the symptoms are heavy and appear early, with reported losses exceeding 50% (Ward et al., 1999). Recent work has identified that there are at least two sister species of CZ, as well as potentially other isolates of Cercospora, capable of causing GLS (Carson et al., Maydica 51:89-92 (2006); Carson et al., Plant Dis. 86:1088-109 (2002)). Genomic regions on maize Chromosomes 1, 2, 3, 4, 5, 6, 7, and 8 have been associated with GLS using RFLP, AFLP and SSR markers (U.S. Pat. No. 5,574,210; Lehmensiek, et al., TAG, (2001); Clements, et al. Phytopathology (2000); Gorden et al., Crop Science (2004); Bubeck, et al., Crop Science, (1993); Saghai-Maroof et al., Theor. Appl. Genet (1996)).

Goss' Wilt is another disease of corn which has been identified throughout the U.S. Corn Belt, primarily in the western regions. Symptoms include leaf freckles, which are small dark green to black water-soaked spots, and vascular wilt, which results in loss of yield. Conservation tillage practices can increase pervasiveness because the bacterial pathogen Clavibacter michiganensis subsp. nebraskensis (CN) can overwinter in debris, particularly stalks, from infected corn plants (Bradbury, J. F. IMI description of Fungi and Bacteria, (1998)). A mapping study conducted by Rocheford et al. reported a genomic region on maize Chromosome 4 associated with Goss' Wilt (Rocheford, et al., Journal of Heredity 80(5), (1989)). Both GLS and Goss' Wilt are significant diseases of corn, and a need exists for development of disease resistant lines.

Breeding for corn plants resistant to GLS and Goss' Wilt can be greatly facilitated by the use of marker-assisted selection. Of the classes of genetic markers, single nucleotide polymorphisms (SNPs) have characteristics which make them preferential to other genetic markers in detecting, selecting for, and introgressing disease resistance in a corn plant. SNPs are preferred because technologies are available for automated, high-throughput screening of SNP markers, which can decrease the time to select for and introgress disease resistance in corn plants. Further, SNP markers are ideal because the likelihood that a particular SNP allele is derived from independent origins in the extant population of a particular species is very low. As such, SNP markers are useful for tracking and assisting introgression of disease resistance alleles, particularly in the case of disease resistance haplotypes.

SUMMARY OF THE INVENTION

In certain embodiments, methods for the association of at least one genotype with at least one phenotype using a haploid plant comprising: a) assaying at least one genotype of at least one haploid plant with at least one genetic marker; and b) associating the at least one marker with at least one phenotypic trait are provided. In certain embodiments, the at least one genetic marker comprises a single nucleotide polymorphism (SNP), an insertion or deletion in DNA sequence (Indel), a simple sequence repeat of DNA sequence (SSR), a restriction fragment length polymorphism, a haplotype, or a tag SNP. In other embodiments, the methods can further comprise the step of using an association determined in step (b) to make a selection in a plant breeding program. In such embodiments comprising a selection, the selection can comprise any one or all of: 1) selecting among breeding populations based on the at least one genotype; 2) selecting progeny in one or more breeding populations based on the at least one genotype; 3) selecting among parental lines based on prediction of progeny performance; 4) selecting a line for advancement in a germplasm improvement activity based on the at least one genotype; and/or 5) selecting a line for advancement in a germplasm improvement activity where the germplasm improvement activity is selected from the group consisting of line development, variety development, hybrid development, transgenic event selection, making breeding crosses, testing and advancing a plant through self fertilization, purification of lines or sublines, using plants or parts thereof for transformation, using plants or parts thereof for candidates for expression constructs, and using plants or parts thereof for mutagenesis. In certain embodiments, the methods can further comprise the step of doubling at least one haploid plant selected in said breeding program to obtain a doubled haploid plant.
In such embodiments where a doubled haploid plant is obtained, the doubled haploid plant can be used for introgression of the genotype of interest into at least a second plant for use in a plant breeding program. In certain embodiments, the haploid plant in step (a) is obtained from a haploid breeding population. In certain embodiments, the haploid plant or plants comprise an intact plant, a leaf, vascular tissue, flower, pod, root, stem, seed or portion thereof. In certain embodiments, the plants are selected from the group consisting of maize (Zea mays), soybean (Glycine max), cotton (Gossypium hirsutum), peanut (Arachis hypogaea), barley (Hordeum vulgare); oats (Avena sativa); orchard grass (Dactylis glomerata); rice (Oryza sativa, including indica and japonica varieties); sorghum (Sorghum bicolor); sugar cane (Saccharum sp); tall fescue (Festuca arundinacea); turfgrass species (e.g. species: Agrostis stolonifera, Poa pratensis, Stenotaphrum secundatum); wheat (Triticum aestivum), and alfalfa (Medicago sativa), members of the genus Brassica, carrot, cucumber, dry bean, eggplant, fennel, garden beans, gourd, leek, lettuce, melon, okra, onion, pea, pepper, pumpkin, radish, spinach, squash, sweet corn, tomato, watermelon, and ornamental plants. In certain embodiments, the haploid plant is a fruit, vegetable, tuber, or root crop. In certain embodiments, the trait is selected from the group consisting of herbicide tolerance, disease resistance, insect or pest resistance, altered fatty acid, protein or carbohydrate metabolism, increased grain yield, increased oil, enhanced nutritional content, increased growth rates, enhanced stress tolerance, preferred maturity, enhanced organoleptic properties, altered morphological characteristics, sterility, a trait for industrial use, and a trait for consumer appeal.

In certain embodiments, methods for identifying an association of a plant genotype with one or more traits of interest comprising: a) screening a plurality of haploid plants displaying heritable variation for at least one trait wherein the heritable variation is linked to at least one genotype; and b) associating at least one genotype of at least one haploid plant to at least one trait are provided. In certain embodiments, the genotype comprises a genetic marker. In certain embodiments, the genetic marker comprises a single nucleotide polymorphism (SNP), an insertion or deletion in DNA sequence (Indel), a simple sequence repeat of DNA sequence (SSR), a restriction fragment length polymorphism, a haplotype, or a tag SNP. In certain embodiments, the methods can further comprise the step of using an association determined in step (b) to make a selection in a plant breeding program. In such embodiments comprising a selection, the selection can comprise any one or all of: 1) selecting among breeding populations based on the at least one genotype; 2) selecting progeny in one or more breeding populations based on the at least one genotype; 3) selecting among parental lines based on prediction of progeny performance; 4) selecting a line for advancement in a germplasm improvement activity based on the at least one genotype; and/or 5) selecting a line for advancement in a germplasm improvement activity where the germplasm improvement activity is selected from the group consisting of line development, variety development, hybrid development, transgenic event selection, making breeding crosses, testing and advancing a plant through self fertilization, purification of lines or sublines, using plants or parts thereof for transformation, using plants or parts thereof for candidates for expression constructs, and using plants or parts thereof for mutagenesis.
In certain embodiments, the methods can further comprise the step of doubling at least one haploid plant selected in the breeding program to obtain a doubled haploid plant. In certain embodiments, the doubled haploid plant is used for introgressing the genotype of interest into a plant for use in a plant breeding program. In certain embodiments, the haploid plant or plants comprise an intact plant, a leaf, vascular tissue, flower, pod, root, stem, seed or portion thereof. In certain embodiments, the plants are selected from the group consisting of maize (Zea mays), soybean (Glycine max), cotton (Gossypium hirsutum), peanut (Arachis hypogaea), barley (Hordeum vulgare); oats (Avena sativa); orchard grass (Dactylis glomerata); rice (Oryza sativa, including indica and japonica varieties); sorghum (Sorghum bicolor); sugar cane (Saccharum sp); tall fescue (Festuca arundinacea); turfgrass species (e.g. species: Agrostis stolonifera, Poa pratensis, Stenotaphrum secundatum); wheat (Triticum aestivum), and alfalfa (Medicago sativa), members of the genus Brassica, carrot, cucumber, dry bean, eggplant, fennel, garden beans, gourd, leek, lettuce, melon, okra, onion, pea, pepper, pumpkin, radish, spinach, squash, sweet corn, tomato, watermelon, and ornamental plants. In certain embodiments, the haploid plant is a fruit, vegetable, tuber, or root crop. In certain embodiments, the trait is selected from the group consisting of herbicide tolerance, disease resistance, insect or pest resistance, altered fatty acid, protein or carbohydrate metabolism, increased grain yield, increased oil, enhanced nutritional content, increased growth rates, enhanced stress tolerance, preferred maturity, enhanced organoleptic properties, altered morphological characteristics, sterility, a trait for industrial use, and a trait for consumer appeal.

In certain embodiments, methods for the association of at least one phenotype with at least one genetic marker using a haploid plant comprising: a) assaying at least one phenotype of at least one haploid plant with at least one phenotypic marker to determine the presence or absence of said phenotype; and b) associating the presence or absence of said phenotype with at least one genetic marker are provided. In certain embodiments of the methods, the haploid plant is obtained from a haploid breeding population. In certain embodiments of the methods, the at least one genetic marker can comprise a single nucleotide polymorphism (SNP), an insertion or deletion in DNA sequence (Indel), a simple sequence repeat of DNA sequence (SSR), a restriction fragment length polymorphism, a haplotype, or a tag SNP. In certain embodiments of the methods, the at least one phenotypic marker can comprise at least one of a transcriptional profile, a metabolic profile, a nutrient composition profile, a protein expression profile, protein composition, protein levels, oil composition, oil levels, carbohydrate composition, carbohydrate levels, fatty acid composition, fatty acid levels, amino acid composition, amino acid levels, biopolymers, pharmaceuticals, starch composition, starch levels, fermentable starch, fermentation yield, fermentation efficiency, energy yield, secondary compounds, metabolites, morphological characteristics, or an agronomic characteristic. In certain embodiments of these methods, the methods can further comprise the step of using an association determined in step (b) to make a selection in a plant breeding program.
In certain embodiments comprising a selection, the selection can comprise any one or all of: 1) selecting among breeding populations based on the at least one genotype; 2) selecting progeny in one or more breeding populations based on the at least one genotype; 3) selecting among parental lines based on prediction of progeny performance; 4) selecting a line for advancement in a germplasm improvement activity based on the at least one genotype; and/or 5) selecting a line for advancement in a germplasm improvement activity where the germplasm improvement activity is selected from the group consisting of line development, variety development, hybrid development, transgenic event selection, making breeding crosses, testing and advancing a plant through self fertilization, purification of lines or sublines, using plants or parts thereof for transformation, using plants or parts thereof for candidates for expression constructs, and using plants or parts thereof for mutagenesis. In certain embodiments of these methods, the methods can further comprise the step of doubling at least one haploid plant selected in said breeding program to obtain a doubled haploid plant. In certain embodiments comprising obtainment of a doubled haploid plant, the doubled haploid plant is used for introgression of the genotype of interest into at least a second plant for use in a plant breeding program.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

The definitions and methods provided herein define the present invention and guide those of ordinary skill in the art in the practice of the present invention. Unless otherwise noted, terms are to be understood according to conventional usage by those of ordinary skill in the relevant art. Definitions of common terms in molecular biology may also be found in Alberts et al., Molecular Biology of The Cell, 3rd Edition, Garland Publishing, Inc.: New York, 1994; Rieger et al., Glossary of Genetics: Classical and Molecular, 5th Edition, Springer-Verlag: New York, 1991; and Lewin, Genes V, Oxford University Press: New York, 1994. The nomenclature for DNA bases as set forth at 37 CFR § 1.822 is used.

As used herein, a “locus” is a fixed position on a chromosome and may represent a single nucleotide, a few nucleotides or a large number of nucleotides in a genomic region.

As used herein, “polymorphism” means the presence of one or more variations of a nucleic acid sequence at one or more loci in a population of one or more individuals. The variation may comprise, but is not limited to, one or more base changes, the insertion of one or more nucleotides or the deletion of one or more nucleotides. A polymorphism includes a single nucleotide polymorphism (SNP), a simple sequence repeat (SSR) and indels, which are insertions and deletions. A polymorphism may arise from random processes in nucleic acid replication, through mutagenesis, as a result of mobile genomic elements, from copy number variation and during the process of meiosis, such as unequal crossing over, genome duplication and chromosome breaks and fusions. The variation can be commonly found or may exist at low frequency within a population, the former having greater utility in general plant breeding, whereas the latter may be associated with rare but important phenotypic variation.

As used herein, “marker” means a detectable characteristic that can be used to discriminate between organisms. Examples of such characteristics may include genetic markers, protein composition, protein levels, oil composition, oil levels, carbohydrate composition, carbohydrate levels, fatty acid composition, fatty acid levels, amino acid composition, amino acid levels, biopolymers, pharmaceuticals, starch composition, starch levels, fermentable starch, fermentation yield, fermentation efficiency, energy yield, secondary compounds, metabolites, morphological characteristics, and agronomic characteristics.

As used herein, “genetic marker” means a polymorphic nucleic acid sequence or nucleic acid feature. A “polymorphism” is a variation among individuals in sequence, particularly in DNA sequence, or feature, such as a transcriptional profile or methylation pattern. Useful polymorphisms include single nucleotide polymorphisms (SNPs), insertions or deletions in DNA sequence (Indels), simple sequence repeats of DNA sequence (SSRs), a restriction fragment length polymorphism, a haplotype, and a tag SNP. A genetic marker, a gene, a DNA-derived sequence, a RNA-derived sequence, a promoter, a 5′ untranslated region of a gene, a 3′ untranslated region of a gene, microRNA, siRNA, a QTL, a satellite marker, a transgene, mRNA, ds mRNA, a transcriptional profile, and a methylation pattern may comprise polymorphisms.

As used herein, “marker assay” means a method for detecting a polymorphism at a particular locus using a particular method, e.g. measurement of at least one phenotype (such as seed color, flower color, or other visually detectable trait), restriction fragment length polymorphism (RFLP), single base extension, electrophoresis, sequence alignment, allelic specific oligonucleotide hybridization (ASO), random amplified polymorphic DNA (RAPD), microarray-based technologies, and nucleic acid sequencing technologies, etc.

As used herein, the phrase “immediately adjacent”, when used to describe a nucleic acid molecule that hybridizes to DNA containing a polymorphism, refers to a nucleic acid that hybridizes to DNA sequences that directly abut the polymorphic nucleotide base position. For example, a nucleic acid molecule that can be used in a single base extension assay is “immediately adjacent” to the polymorphism.

As used herein, “interrogation position” refers to a physical position on a solid support that can be queried to obtain genotyping data for one or more predetermined genomic polymorphisms.

As used herein, “consensus sequence” refers to a constructed DNA sequence which identifies SNP and Indel polymorphisms in alleles at a locus. Consensus sequence can be based on either strand of DNA at the locus and states the nucleotide base of either one of each SNP in the locus and the nucleotide bases of all Indels in the locus. Thus, although a consensus sequence may not be a copy of an actual DNA sequence, a consensus sequence is useful for precisely designing primers and probes for actual polymorphisms in the locus.

As used herein, the term “single nucleotide polymorphism,” also referred to by the abbreviation “SNP,” means a polymorphism at a single site wherein said polymorphism constitutes a single base pair change, an insertion of one or more base pairs, or a deletion of one or more base pairs.

As used herein, “genotype” means the genetic component of the phenotype and it can be indirectly characterized using markers or directly characterized by nucleic acid sequencing. Suitable markers include a phenotypic character, a metabolic profile, a genetic marker, or some other type of marker. A genotype may constitute an allele for at least one genetic marker locus or a haplotype for at least one haplotype window. In some embodiments, a genotype may represent a single locus and in others it may represent a genome-wide set of loci. In another embodiment, the genotype can reflect the sequence of a portion of a chromosome, an entire chromosome, a portion of the genome, and the entire genome.

As used herein, the term “haplotype” means a chromosomal region within a haplotype window defined by at least one polymorphic molecular marker. The unique marker fingerprint combinations in each haplotype window define individual haplotypes for that window. Further, changes in a haplotype, brought about by recombination for example, may result in the modification of a haplotype so that it comprises only a portion of the original (parental) haplotype operably linked to the trait, for example, via physical linkage to a gene, QTL, or transgene. Any such change in a haplotype would be included in our definition of what constitutes a haplotype so long as the functional integrity of that genomic region is unchanged or improved.

As used herein, the term “haplotype window” means a chromosomal region that is established by statistical analyses known to those of skill in the art and is in linkage disequilibrium. Thus, identity by state between two inbred individuals (or two gametes) at one or more molecular marker loci located within this region is taken as evidence of identity-by-descent of the entire region. Each haplotype window includes at least one polymorphic molecular marker. Haplotype windows can be mapped along each chromosome in the genome. Haplotype windows are not fixed per se and, given the ever-increasing density of molecular markers, this invention anticipates the number and size of haplotype windows to evolve, with the number of windows increasing and their respective sizes decreasing, thus resulting in an ever-increasing degree of confidence in ascertaining identity by descent based on the identity by state at the marker loci.

As used herein, a plant referred to as “haploid” has a single set (genome) of chromosomes and the reduced number of chromosomes (n) in the haploid plant is equal to that of the gamete.

As used herein, a plant referred to as “doubled haploid” is developed by doubling the haploid set of chromosomes. A plant or seed that is obtained from a doubled haploid plant that is selfed any number of generations may still be identified as a doubled haploid plant. A doubled haploid plant is considered a homozygous plant. A plant is considered to be doubled haploid if it is fertile, even if the entire vegetative part of the plant does not consist of the cells with the doubled set of chromosomes; that is, a plant will be considered doubled haploid if it contains viable gametes, even if it is chimeric.

As used herein, a plant referred to as “diploid” has two sets (genomes) of chromosomes and the chromosome number (2n) is equal to that of the zygote.

As used herein, the term “plant” includes whole plants, plant organs (i.e., leaves, stems, roots, etc.), seeds, and plant cells and progeny of the same. “Plant cell” includes without limitation seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, shoots, gametophytes, sporophytes, pollen, and microspores.

As used herein, a “genetic map” is the ordered list of loci known for a particular genome.

As used herein, “phenotype” means the detectable characteristics of a cell or organism which are a manifestation of gene expression.

As used herein, a “phenotypic marker” refers to a marker that can be used to discriminate phenotypes displayed by organisms.

As used herein, “linkage” refers to the relative frequency at which types of gametes are produced in a cross. For example, if locus A has genes “A” or “a” and locus B has genes “B” or “b”, a cross between parent I with AABB and parent II with aabb will produce four possible gametes where the genes are segregated into AB, Ab, aB and ab. The null expectation is that there will be independent equal segregation into each of the four possible genotypes, i.e., with no linkage ¼ of the gametes will be of each genotype. Segregation of gametes into genotypes at frequencies differing from ¼ is attributed to linkage.
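The null expectation described above can be checked numerically. The following is a minimal sketch, not part of the disclosed method, that assumes gamete classes have already been counted (for example, from genotyped haploid plants). The function names and the example counts are hypothetical and chosen only for illustration; the test used is an ordinary chi-square goodness-of-fit test against equal ¼ segregation.

```python
# Illustrative sketch only: compares observed gamete counts with the
# no-linkage expectation of 1/4 per class. All names and counts are hypothetical.

def segregation_chi_square(counts):
    """Chi-square statistic for deviation from equal (1/4 each) segregation."""
    total = sum(counts.values())
    expected = total / 4.0
    return sum((obs - expected) ** 2 / expected for obs in counts.values())


def recombinant_fraction(counts, parental=("AB", "ab")):
    """Fraction of gametes that are not one of the two parental classes."""
    total = sum(counts.values())
    recombinant = sum(n for gamete, n in counts.items() if gamete not in parental)
    return recombinant / total


# Hypothetical counts of the four gamete classes among 100 haploid plants.
counts = {"AB": 40, "Ab": 10, "aB": 10, "ab": 40}

chi2 = segregation_chi_square(counts)
rf = recombinant_fraction(counts)

# 7.815 is the 5% critical value of chi-square with 3 degrees of freedom.
linked = chi2 > 7.815
print(f"chi-square = {chi2:.1f}, recombinant fraction = {rf:.2f}, linked = {linked}")
```

In this hypothetical data set the parental classes AB and ab are over-represented, so the statistic exceeds the critical value and the deviation from ¼ would be attributed to linkage.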

As used herein, “linkage disequilibrium” is defined in the context of the relative frequency of gamete types in a population of many individuals in a single generation. If the frequency of allele A is p, a is p′, B is q and b is q′, then the expected frequency (with no linkage disequilibrium) of genotype AB is pq, Ab is pq′, aB is p′q and ab is p′q′. Any deviation from the expected frequency is called linkage disequilibrium. Two loci are said to be “genetically linked” when they are in linkage disequilibrium.
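The expected frequencies in this definition can be worked through in a few lines. The sketch below is illustrative only: it assumes the four gamete-type frequencies are known, derives the allele frequencies p and q from them, and reports the deviation of the observed AB frequency from pq (commonly written D). The function name and the example frequencies are hypothetical.

```python
# Illustrative sketch only: deviation of observed gamete frequencies from the
# products of allele frequencies. Names and example frequencies are hypothetical.

def linkage_disequilibrium(freqs):
    """Return (expected frequencies under no LD, deviation D for gamete type AB)."""
    p = freqs["AB"] + freqs["Ab"]   # frequency of allele A
    q = freqs["AB"] + freqs["aB"]   # frequency of allele B
    p_prime = 1.0 - p               # frequency of allele a
    q_prime = 1.0 - q               # frequency of allele b
    expected = {
        "AB": p * q,
        "Ab": p * q_prime,
        "aB": p_prime * q,
        "ab": p_prime * q_prime,
    }
    d = freqs["AB"] - expected["AB"]
    return expected, d


# Hypothetical gamete-type frequencies in a population; they sum to 1.
observed = {"AB": 0.4, "Ab": 0.1, "aB": 0.1, "ab": 0.4}
expected, d = linkage_disequilibrium(observed)
print(expected, d)
```

A value of D equal to zero corresponds to the no-disequilibrium expectation; any nonzero value is a deviation of the kind the definition calls linkage disequilibrium.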

As used herein, “quantitative trait locus (QTL)” means a locus that controls to some degree numerically representable traits that are usually continuously distributed.

As used herein, the term “transgene” means nucleic acid molecules in the form of DNA, such as cDNA or genomic DNA, and RNA, such as mRNA or microRNA, which may be single or double stranded.

As used herein, the term “inbred” means a line that has been bred for genetic homogeneity.

As used herein, the term “hybrid” means a progeny of mating between at least two genetically dissimilar parents. Without limitation, examples of mating schemes include single crosses, modified single cross, double modified single cross, three-way cross, modified three-way cross, and double cross wherein at least one parent in a modified cross is the progeny of a cross between sister lines.

As used herein, the term “tester” means a line used in a testcross with another line wherein the tester and the lines tested are from different germplasm pools. A tester may be isogenic or nonisogenic.

As used herein, “resistance allele” means the isolated nucleic acid sequence that includes the polymorphic allele associated with resistance to the disease or condition of concern.

As used herein, the term “corn” means Zea mays or maize and includes all plant varieties that can be bred with corn, including wild maize species.

As used herein, the term “comprising” means “including but not limited to”.

As used herein, an “elite line” is any line that has resulted from breeding and selection for superior agronomic performance.

As used herein, an “inducer” is a line which when crossed with another line promotes the formation of haploid embryos.

As used herein, “haplotype effect estimate” means a predicted effect estimate for a haplotype reflecting association with one or more phenotypic traits, wherein the associations can be made de novo or by leveraging historical haplotype-trait association data.

As used herein, “breeding value” means a calculation based on nucleic acid sequence effect estimates and nucleic acid sequence frequency values. The breeding value of a specific nucleic acid sequence relative to other nucleic acid sequences at the same locus (i.e., haplotype window), or across loci (i.e., haplotype windows), can also be determined. In other words, the change in population mean by fixing said nucleic acid sequence is determined. In addition, in the context of evaluating the effect of substituting a specific region in the genome, either by introgression or a transgenic event, breeding values provide the basis for comparing specific nucleic acid sequences for substitution effects. Also, in hybrid crops, the breeding value of nucleic acid sequences can be calculated in the context of the nucleic acid sequence in the tester used to produce the hybrid.
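One way to read "the change in population mean by fixing said nucleic acid sequence" is sketched below. This is an assumption made for illustration, not a formula given in this disclosure: it treats a single haplotype window under a simple additive model, takes the current population mean as the frequency-weighted average of the effect estimates, and reports the difference between each haplotype's effect and that mean. The haplotype labels, effect estimates, and frequencies are hypothetical.

```python
# Illustrative sketch only, under an assumed additive single-window model.
# Haplotype labels, effect estimates and frequencies are hypothetical.

def breeding_values(effects, freqs):
    """Change in population mean if each haplotype were fixed in the population."""
    population_mean = sum(freqs[h] * effects[h] for h in effects)
    return {h: effects[h] - population_mean for h in effects}


# Hypothetical effect estimates (trait units) and frequencies in one window.
effects = {"H1": 2.0, "H2": 0.5, "H3": -1.0}
freqs = {"H1": 0.2, "H2": 0.5, "H3": 0.3}

values = breeding_values(effects, freqs)
best = max(values, key=values.get)
print(values, best)
```

Under these assumptions a rare haplotype with a large favorable effect receives the highest value, because fixing it would shift the population mean the most.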

To the extent to which any of the preceding definitions is inconsistent with definitions provided in any patent or non-patent reference incorporated herein or in any reference found elsewhere, it is understood that the preceding definition will be used herein.

Haploid Mapping

Induction of haploidization followed by diploidization requires a high input of resources. Diploidization represents a rate-limiting step as it is expensive and requires a high input of labor as well as plant material in order to generate sufficient breeding material. The present invention includes methods for the use of homozygous plant material for quantitative genetic studies. Significant time and resources can be saved by using haploid plants for QTL mapping. These plants have only one parental set of chromosomes and thus are hemizygous for all genes in their genome. This property allows for a resolution in genetic mapping which is similar to that of recombinant inbred lines (RILs) with the advantage that haploid plants can be produced in only one growing season. Further, the present invention provides an increased efficiency in allocation of diploidization resources as only those haploid plants with at least one QTL of interest can be advanced for doubling.
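The allocation of doubling resources described above amounts to a filter on genotyped haploid plants. The following minimal sketch assumes each haploid plant carries a single allele call per marker and that a favorable allele has already been identified for each QTL-linked marker; the plant identifiers, marker names, and allele calls are hypothetical.

```python
# Illustrative sketch only: keep haploid plants that carry at least one
# favorable QTL-linked allele. All identifiers and calls are hypothetical.

def select_for_doubling(haploid_genotypes, favorable_alleles):
    """Return IDs of haploid plants carrying at least one favorable allele."""
    selected = []
    for plant_id, calls in haploid_genotypes.items():
        if any(calls.get(marker) == allele
               for marker, allele in favorable_alleles.items()):
            selected.append(plant_id)
    return selected


# One allele call per marker, because haploid plants carry a single genome.
haploid_genotypes = {
    "plant_1": {"marker_a": "A", "marker_b": "T"},
    "plant_2": {"marker_a": "G", "marker_b": "T"},
    "plant_3": {"marker_a": "G", "marker_b": "C"},
}
favorable_alleles = {"marker_a": "A", "marker_b": "C"}

to_double = select_for_doubling(haploid_genotypes, favorable_alleles)
print(to_double)
```

Only the plants returned by the filter would be advanced to the doubling step, so plants without any QTL of interest do not consume diploidization resources.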

In certain embodiments, the present invention comprises identification and introgression of QTL associated with desirable traits using haploid plants in a plant breeding program. In one aspect, the present invention includes methods and compositions for mapping disease resistance loci in corn.

The present invention provides a method of using haploid plants to identify genotypes associated with phenotypes of interest, wherein the haploid plant is assayed with at least one marker and the at least one marker is associated with at least one phenotypic trait. The genotype of interest can then be used to make decisions in a plant breeding program. Such decisions include, but are not limited to, selecting among new breeding populations which population has the highest frequency of favorable nucleic acid sequences based on historical genotype and agronomic trait associations, selecting favorable nucleic acid sequences among progeny in breeding populations, selecting among parental lines based on prediction of progeny performance, and advancing lines in germplasm improvement activities based on presence of favorable nucleic acid sequences. Non-limiting examples of germplasm improvement activities include line development, hybrid development, transgenic event selection, making breeding crosses, testing and advancing a plant through self fertilization, using plants for transformation, using plants for candidates for expression constructs, and using plants for mutagenesis.

Non-limiting examples of breeding decisions include progeny selection, parent selection, and recurrent selection for at least one haplotype. In another aspect, breeding decisions relating to development of plants for commercial release comprise advancing plants for testing, advancing plants for purity, purification of sublines during development, inbred development, variety development, and hybrid development. In yet other aspects, breeding decisions and germplasm improvement activities comprise transgenic event selection, making breeding crosses, testing and advancing a plant through self-fertilization, using plants for transformation, using plants for candidates for expression constructs, and using plants for mutagenesis.

In still another embodiment, the present invention acknowledges that preferred haplotypes and QTL identified by the methods presented herein may be advanced as candidate genes for inclusion in expression constructs, i.e., transgenes. Nucleic acids underlying haplotypes or QTL of interest may be expressed in plant cells by operably linking them to a promoter functional in plants. In another aspect, nucleic acids underlying haplotypes or QTL of interest may have their expression modified by double-stranded RNA-mediated gene suppression, also known as RNA interference (“RNAi”), which includes suppression mediated by small interfering RNAs (“siRNA”), trans-acting small interfering RNAs (“ta-siRNA”), or microRNAs (“miRNA”). Examples of RNAi methodology suitable for use in plants are described in detail in U.S. Patent Application Publications 2006/0200878 and 2007/0011775.

Methods are known in the art for assembling and introducing constructs into a cell in such a manner that the nucleic acid molecule for a trait is transcribed into a functional mRNA molecule that is translated and expressed as a protein product. For the practice of the present invention, conventional compositions and methods for preparing and using constructs and host cells are well known to one skilled in the art, see for example, Molecular Cloning: A Laboratory Manual, 3rd Edition Volumes 1, 2, and 3 (2000) J. F. Sambrook, D. W. Russell, and N. Irwin, Cold Spring Harbor Laboratory Press. Methods for making transformation constructs particularly suited to plant transformation include, without limitation, those described in U.S. Pat. Nos. 4,971,908, 4,940,835, 4,769,061 and 4,757,011, all of which are herein incorporated by reference in their entirety. Transformation methods for the introduction of expression units into plants are known in the art and include electroporation as illustrated in U.S. Pat. No. 5,384,253; microprojectile bombardment as illustrated in U.S. Pat. Nos. 5,015,580; 5,550,318; 5,538,880; 6,160,208; 6,399,861; and 6,403,865; protoplast transformation as illustrated in U.S. Pat. No. 5,508,184; and Agrobacterium-mediated transformation as illustrated in U.S. Pat. Nos. 5,635,055; 5,824,877; 5,591,616; 5,981,840; and 6,384,301.

The method of the present invention can be used to identify genotypes associated with phenotypes of interest such as those associated with disease resistance, herbicide tolerance, insect or pest resistance, altered fatty acid, protein or carbohydrate metabolism, increased grain yield, increased oil, enhanced nutritional content, increased growth rates, enhanced stress tolerance, preferred maturity, enhanced organoleptic properties, altered morphological characteristics, sterility, other agronomic traits, traits for industrial uses, or traits for consumer appeal.

Production of DH plants, which entails induction of haploidization followed by diploidization, requires a high input of resources. DH plants rarely occur naturally; therefore, artificial means of production are used. First, one or more lines are crossed with an inducer parent to produce haploid seed. Inducer lines for maize include Stock 6, RWS, KEMS, KMS and ZMS, and indeterminate gametophyte (ig) mutation. Selection of haploid seed can be accomplished by various screening methods based on phenotypic or genotypic characteristics. In one approach, material is screened with visible marker genes, including GFP, GUS, anthocyanin genes such as R-nj, luciferase, YFP, CFP, or CRC, that are only induced in the endosperm cells of haploid cells, allowing for separation of haploid and diploid seed. Other screening approaches include chromosome counting, flow cytometry, genetic marker evaluation to infer copy number, etc.

Resulting haploid seed has a haploid embryo and a normal triploid endosperm. There are several approaches known in the art to achieve chromosome doubling. Haploid cells, haploid embryos, haploid seeds, haploid seedlings, or haploid plants can be treated with a doubling agent. Non-limiting examples of known doubling agents include nitrous oxide gas, anti-microtubule herbicides, anti-microtubule agents, colchicine, pronamide, and mitotic inhibitors.

The present invention provides and includes a method for using haploid plants to map and fine-map QTL associated with a trait such as disease resistance in plants. The present invention also provides and includes a method for screening and selecting a corn plant comprising QTL for Gray Leaf Spot (GLS) resistance using endemic strains of CZ and single nucleotide polymorphism (SNP) marker technology. The present invention further provides and includes a method for screening and selecting a corn plant comprising QTL for Goss' Wilt resistance using endemic strains of CN and SNP marker technology.

The present invention provides corn genomic DNA markers associated with GLS resistance. SEQ ID NOs: 36, 421, 481, 659, 1127, 1219, 1228, 1229, 1230, 1231, 1232, and 1233 were associated with GLS resistance by the methods of the present invention.

The present invention further provides corn genomic DNA markers associated with Goss' Wilt resistance. SEQ ID NOs: 24, 158, 218, 234, 236, 272, 368, 371, 375, 401, 408, 440, 498, 587, 599, 629, 721, 733, 744, 768, 850, 896, 940, 951, 976, 1015, 1098, 1215, 1229, 1247, 1250, 1255, 1274, 1275, 1276, 1277, 1278, 1279, 1280, 1281, 1282, 1283, 1284, 1285, 1286, 1287, 1288, 1289, 1290, 1291, 1292, 1293, 1294, 1295, 1296, 1297, 1298, 1299, 1300, 1301, 1302, and 1303 were associated with Goss' Wilt resistance by the methods of the present invention.

The present invention includes methods for breeding crop plants such as maize (Zea mays), soybean (Glycine max), cotton (Gossypium hirsutum), peanut (Arachis hypogaea), barley (Hordeum vulgare); oats (Avena sativa); orchard grass (Dactylis glomerata); rice (Oryza sativa, including indica and japonica varieties); sorghum (Sorghum bicolor); sugar cane (Saccharum sp); tall fescue (Festuca arundinacea); turfgrass species (e.g. species: Agrostis stolonifera, Poa pratensis, Stenotaphrum secundatum); wheat (Triticum aestivum), and alfalfa (Medicago sativa), members of the genus Brassica, broccoli, cabbage, carrot, cauliflower, Chinese cabbage, cucumber, dry bean, eggplant, fennel, garden beans, gourd, leek, lettuce, melon, okra, onion, pea, pepper, pumpkin, radish, spinach, squash, sweet corn, tomato, watermelon, ornamental plants, and other fruit, vegetable, tuber, and root crops.

It is appreciated by one skilled in the art that haploid plants can be generated from any generation of plant population and that the methods of the present invention can be used with one or more individuals, including SSD, from any generation of plant population. Non-limiting examples of plant populations include F1, F2, BC1, BC2F1, F3:F4, F2:F3, and so on, including subsequent filial generations, as well as experimental populations such as RILs and NILs. It is further anticipated that the degree of segregation within the one or more plant populations of the present invention can vary depending on the nature of the trait and germplasm under evaluation.

Marker-Trait Associations

For the purpose of QTL mapping, the markers included should be diagnostic of origin in order for inferences to be made about subsequent populations. SNP markers are ideal for mapping because the likelihood that a particular SNP allele is derived from independent origins in the extant populations of a particular species is very low. As such, SNP markers are useful for tracking and assisting introgression of QTLs, particularly in the case of haplotypes.

The genetic linkage of additional marker molecules can be established by a gene mapping model such as, without limitation, the flanking marker model reported by Lander et al. (Lander et al., 1989 Genetics, 121:185-199), and the interval mapping, based on maximum likelihood methods described therein, and implemented in the software package MAPMAKER/QTL (Lincoln and Lander, Mapping Genes Controlling Quantitative Traits Using MAPMAKER/QTL, Whitehead Institute for Biomedical Research, Massachusetts, (1990)). Additional software includes Qgene, Version 2.23 (1996) (Department of Plant Breeding and Biometry, 266 Emerson Hall, Cornell University, Ithaca, N.Y.). Use of Qgene software is a particularly preferred approach.

A maximum likelihood estimate (MLE) for the presence of a QTL is calculated, together with an MLE assuming no QTL effect, to avoid false positives. A log₁₀ of an odds ratio (LOD) is then calculated as: LOD=log₁₀(MLE for the presence of a QTL/MLE given no linked QTL). The LOD score essentially indicates how much more likely the data are to have arisen assuming the presence of a QTL versus its absence. The LOD threshold value for avoiding a false positive with a given confidence, say 95%, depends on the number of markers and the length of the genome. Graphs indicating LOD thresholds are set forth in Lander et al., (1989), and further described by Arús and Moreno-González, Plant Breeding, Hayward, Bosemark, Romagosa (eds.) Chapman & Hall, London, pp. 314-331 (1993).
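By way of non-limiting illustration, the LOD calculation described above can be sketched for the simplest case of a single marker scored in a haploid population. The sketch assumes normally distributed residuals with a common variance, under which the likelihood ratio reduces to a ratio of residual sums of squares. The function name `single_marker_lod` and the example data are hypothetical. Interval mapping as implemented in MAPMAKER/QTL or Qgene is more involved and is not reproduced here.

```python
import math


def single_marker_lod(alleles, phenotypes):
    """Return LOD = log10(L_QTL / L_no_QTL) for one marker.

    Assumption: residuals are normal with a common variance, so that
    LOD = (n / 2) * log10(RSS_null / RSS_qtl).
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    # Null model: one mean for all plants (no linked QTL).
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    # QTL model: one mean per marker allele class.
    rss_qtl = 0.0
    for allele in set(alleles):
        ys = [y for a, y in zip(alleles, phenotypes) if a == allele]
        class_mean = sum(ys) / len(ys)
        rss_qtl += sum((y - class_mean) ** 2 for y in ys)
    if rss_qtl == 0.0:
        # Perfect fit (or no variation at all): the ratio is undefined.
        return float("inf") if rss_null > 0.0 else 0.0
    return (n / 2.0) * math.log10(rss_null / rss_qtl)


# Hypothetical haploid data: one allele call and one disease score per plant.
example_alleles = ["A", "A", "A", "B", "B", "B"]
example_scores = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
example_lod = single_marker_lod(example_alleles, example_scores)
```

The resulting LOD would then be compared against a threshold chosen for the number of markers and genome length, as discussed above.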

Additional models can be used. Many modifications and alternative approaches to interval mapping have been reported, including the use of non-parametric methods (Kruglyak et al., 1995 Genetics, 139:1421-1428). Multiple regression methods or models can also be used, in which the trait is regressed on a large number of markers (Jansen, Biometrics in Plant Breed, van Oijen, Jansen (eds.) Proceedings of the Ninth Meeting of the Eucarpia Section Biometrics in Plant Breeding, The Netherlands, pp. 116-124 (1994); Weber and Wricke, Advances in Plant Breeding, Blackwell, Berlin, 16 (1994)). Procedures combining interval mapping with regression analysis, whereby the phenotype is regressed onto a single putative QTL at a given marker interval, and at the same time onto a number of markers that serve as ‘cofactors,’ have been reported by Jansen et al. (Jansen et al., 1994 Genetics, 136:1447-1455) and Zeng (Zeng 1994 Genetics 136:1457-1468). Generally, the use of cofactors reduces the bias and sampling error of the estimated QTL positions (Utz and Melchinger, Biometrics in Plant Breeding, van Oijen, Jansen (eds.) Proceedings of the Ninth Meeting of the Eucarpia Section Biometrics in Plant Breeding, The Netherlands, pp. 195-204 (1994)), thereby improving the precision and efficiency of QTL mapping (Zeng 1994). These models can be extended to multi-environment experiments to analyze genotype-environment interactions (Jansen et al., 1995 Theor. Appl. Genet. 91:33-3).

Selection of appropriate mapping populations is important to map construction. The choice of an appropriate mapping population depends on the type of marker systems employed (Tanksley et al., Molecular mapping in plant chromosomes. chromosome structure and function: Impact of new concepts J. P. Gustafson and R. Appels (eds.). Plenum Press, New York, pp. 157-173 (1988)). Consideration must be given to the source of parents (adapted vs. exotic) used in the mapping population. Chromosome pairing and recombination rates can be severely disturbed (suppressed) in wide crosses (adapted×exotic) and generally yield greatly reduced linkage distances. Wide crosses will usually provide segregating populations with a relatively large array of polymorphisms when compared to progeny in a narrow cross (adapted×adapted).

An F₂ population is the first generation of selfing. Usually a single F₁ plant is selfed to generate a population segregating for all the genes in Mendelian (1:2:1) fashion. Maximum genetic information is obtained from a completely classified F₂ population using a codominant marker system (Mather, Measurement of Linkage in Heredity: Methuen and Co., (1938)). In the case of dominant markers, progeny tests (e.g. F₃, BCF₂) are required to identify the heterozygotes, thus making it equivalent to a completely classified F₂ population. However, this procedure is often prohibitive because of the cost and time involved in progeny testing. Progeny testing of F₂ individuals is often used in map construction where phenotypes do not consistently reflect genotype (e.g. disease resistance) or where trait expression is controlled by a QTL. Segregation data from progeny test populations (e.g. F₃ or BCF₂) can be used in map construction. Marker-assisted selection can then be applied to cross progeny based on marker-trait map associations (F₂, F₃), where linkage groups have not been completely disassociated by recombination events (i.e., maximum disequilibrium).
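As a non-limiting illustration of the Mendelian (1:2:1) expectation for a codominant marker in an F₂ population, a chi-square goodness-of-fit check can be sketched as follows. The function names and the example counts are hypothetical. The p-value uses the closed form exp(−x/2), which holds only for two degrees of freedom.

```python
import math


def f2_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for observed F2 genotype counts against 1:2:1."""
    total = n_aa + n_ab + n_bb
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value_df2(statistic):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Hypothetical counts for one codominant marker scored on 100 F2 plants.
statistic = f2_chi_square(40, 40, 20)
p_value = chi_square_p_value_df2(statistic)
# A small p-value suggests segregation distortion at this marker.
distorted = p_value < 0.05
```

Markers showing strong distortion might be flagged before map construction; whether to exclude them is a judgment left to the practitioner.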

Recombinant inbred lines (RIL) (genetically related lines; usually >F₅, developed from continuously selfing F₂ lines towards homozygosity) can be used as a mapping population. Information obtained from dominant markers can be maximized by using RIL because all loci are homozygous or nearly so. Under conditions of tight linkage (i.e., about <10% recombination), dominant and co-dominant markers evaluated in RIL populations provide more information per individual than either marker type in backcross populations (Reiter et al., 1992 Proc. Natl. Acad. Sci.(USA) 89:1477-1481). However, as the distance between markers becomes larger (i.e., loci become more independent), the information in RIL populations decreases dramatically.

Backcross populations (e.g., generated from a cross between a successful variety (recurrent parent) and another variety (donor parent) carrying a trait not present in the former) can be utilized as a mapping population. A series of backcrosses to the recurrent parent can be made to recover most of its desirable traits. Thus a population is created consisting of individuals nearly like the recurrent parent but each individual carries varying amounts or mosaic of genomic regions from the donor parent. Backcross populations can be useful for mapping dominant markers if all loci in the recurrent parent are homozygous and the donor and recurrent parent have contrasting polymorphic marker alleles (Reiter et al., 1992). Information obtained from backcross populations using either codominant or dominant markers is less than that obtained from F₂ populations because one, rather than two, recombinant gametes are sampled per plant. Backcross populations, however, are more informative (at low marker saturation) when compared to RILs as the distance between linked loci increases in RIL populations (i.e. about 0.15% recombination). Increased recombination can be beneficial for resolution of tight linkages, but may be undesirable in the construction of maps with low marker saturation.

Near-isogenic lines (NIL) created by many backcrosses to produce an array of individuals that are nearly identical in genetic composition except for the trait or genomic region under interrogation can be used as a mapping population. In mapping with NILs, only a portion of the polymorphic loci are expected to map to a selected region.

Bulk segregant analysis (BSA) is a method developed for the rapid identification of linkage between markers and traits of interest (Michelmore et al., 1991 Proc. Natl. Acad. Sci. (U.S.A.) 88:9828-9832). In BSA, two bulked DNA samples are drawn from a segregating population originating from a single cross. These bulks contain individuals that are identical for a particular trait (resistant or susceptible to particular disease) or genomic region but arbitrary at unlinked regions (i.e. heterozygous). Regions unlinked to the target region will not differ between the bulked samples of many individuals in BSA.

Marker-Assisted Breeding

Further, the present invention contemplates that preferred haploid plants comprising at least one genotype of interest are identified using the methods disclosed in U.S. Patent Application Ser. No. 60/837,864, which is incorporated herein by reference in its entirety, wherein a genotype of interest may correspond to a QTL or haplotype and is associated with at least one phenotype of interest. The methods include association of at least one haplotype with at least one phenotype, wherein the association is represented by a numerical value and the numerical value is used in the decision-making of a breeding program. Non-limiting examples of numerical values include haplotype effect estimates, haplotype frequencies, and breeding values. In the present invention, it is particularly useful to identify haploid plants of interest based on at least one genotype, such that only those lines undergo doubling, which saves resources. Resulting doubled haploid plants comprising at least one genotype of interest are then advanced in a breeding program for use in activities related to germplasm improvement.

In the present invention, haplotypes are defined on the basis of one or more polymorphic markers within a given haplotype window, with haplotype windows being distributed throughout the crop's genome. In another aspect, de novo and/or historical marker-phenotype association data are leveraged to infer haplotype effect estimates for one or more phenotypes for one or more of the haplotypes for a crop. Haplotype effect estimates enable one skilled in the art to make breeding decisions by comparing haplotype effect estimates for two or more haplotypes. Polymorphic markers, and respective map positions, of the present invention are provided in U.S. Patent Applications 2005/0204780, 2005/0216545, 2005/0218305, and Ser. No. 11/504,538, which are incorporated herein by reference in their entirety.

In yet another aspect, haplotype effect estimates are coupled with haplotype frequency values to calculate a haplotype breeding value of a specific haplotype relative to other haplotypes at the same haplotype window, or across haplotype windows, for one or more phenotypic traits. In other words, the change in population mean by fixing the haplotype is determined. In still another aspect, in the context of evaluating the effect of substituting a specific region in the genome, either by introgression or a transgenic event, haplotype breeding values are used as a basis in comparing haplotypes for substitution effects. Further, in hybrid crops, the breeding value of haplotypes is calculated in the context of at least one haplotype in a tester used to produce a hybrid. Once the value of haplotypes at a given haplotype window is determined and high density fingerprinting information is available on specific varieties or lines, selection can be applied to these genomic regions using at least one marker in the at least one haplotype.

In the present invention, selection can be applied at one or more stages of a breeding program:

a) Among genetically distinct populations, herein defined as “breeding populations,” as a pre-selection method to increase the selection index and drive the frequency of favorable haplotypes among breeding populations, wherein pre-selection is defined as selection among populations based on at least one haplotype for use as parents in breeding crosses, and leveraging of marker-trait association identified in previous breeding crosses.

b) Among segregating progeny from a breeding population, to increase the frequency of the favorable haplotypes for the purpose of line or variety development.

c) Among segregating progeny from a breeding population, to increase the frequency of the favorable haplotypes prior to QTL mapping within this breeding population.

d) For hybrid crops, among parental lines from different heterotic groups to predict the performance potential of different hybrids.

In the present invention, it is contemplated that methods of determining associations between genotype and phenotype in haploid plants can be performed based on haplotypes, versus markers alone (Fan et al., 2006 Genetics). A haplotype is a segment of DNA in the genome of an organism that is assumed to be identical by descent for different individuals when the knowledge of identity by state at one or more loci is the same in the different individuals, and when the regional amount of linkage disequilibrium in the vicinity of that segment on the physical or genetic map is high. A haplotype can be tracked through populations and its statistical association with a given trait can be analyzed. By searching the target space for a QTL association across multiple QTL mapping populations that have parental lines with genomic regions that are identical by descent, the effective population size associated with QTL mapping is increased. The increased sample size results in more recombinant progeny, which increases the precision of estimating the QTL position.

Thus, a haplotype association study allows one to define the frequency and the type of the ancestral carrier haplotype. An “association study” is a genetic experiment where one tests the level of departure from randomness between the segregation of alleles at one or more marker loci and the value of individual phenotype for one or more traits. Association studies can be done on quantitative or categorical traits, accounting or not for population structure and/or stratification. In the present invention, associations between haplotypes and phenotypes for the determination of “haplotype effect estimates” can be conducted de novo, using mapping populations for the evaluation of one or more phenotypes, or using historical genotype and phenotype data.
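As a non-limiting sketch of testing the "departure from randomness" between marker alleles and phenotype values, a permutation test for a single biallelic marker scored in haploid plants is shown below. The function name, the fixed random seed, and the example data are hypothetical. Any other accepted statistical test could be substituted.

```python
import random


def permutation_association_test(alleles, phenotypes, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in mean phenotype
    between the two allele classes of one marker.

    Returns (observed absolute mean difference, permutation p-value).
    """
    classes = sorted(set(alleles))
    if len(classes) != 2:
        raise ValueError("expected exactly two marker alleles")

    def abs_mean_difference(labels):
        first = [y for a, y in zip(labels, phenotypes) if a == classes[0]]
        second = [y for a, y in zip(labels, phenotypes) if a == classes[1]]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    observed = abs_mean_difference(alleles)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(alleles)
    hits = 0
    for _ in range(n_perm):
        # Shuffling the labels breaks any marker-trait association
        # while keeping the allele class sizes unchanged.
        rng.shuffle(shuffled)
        if abs_mean_difference(shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical haploid data: allele calls and disease scores for 12 plants.
marker_alleles = ["A"] * 6 + ["B"] * 6
disease_scores = [1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16]
difference, p_value = permutation_association_test(marker_alleles, disease_scores)
```

Because each haploid plant carries a single allele per locus, the two classes compared here correspond directly to the two parental alleles.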

A haplotype analysis is important in that it increases the statistical power of an analysis involving individual biallelic markers. In a first stage of a haplotype frequency analysis, the frequency of the possible haplotypes based on various combinations of the identified biallelic markers of the invention is determined. The haplotype frequency is then compared for distinct populations and a reference population. In general, any method known in the art to test whether a trait and a genotype show a statistically significant correlation may be used.

The statistical significance of a correlation between a phenotype and a genotype, in this case a haplotype, may be determined by any statistical test known in the art and with any accepted threshold of statistical significance being required. The application of particular methods and thresholds of significance is well within the skill of the ordinary practitioner of the art.

To estimate the frequency of a haplotype, the base reference germplasm has to be defined (collection of elite inbred lines, population of random mating individuals, etc.) and a representative sample (or the entire population) has to be genotyped. For example, in one aspect, haplotype frequency is determined by simple counting if considering a set of inbred individuals. In another aspect, estimation methods that employ computing techniques like the Expectation/Maximization (EM) algorithm are required if individuals genotyped are heterozygous at more than one locus in the segment and linkage phase is unknown (Excoffier et al., 1995 Mol. Biol. Evol. 12: 921-927; Li et al. 2002 Biostatistics). Preferably, a method based on the EM algorithm (Dempster et al., 1977 J. R. Stat. Soc. Ser. B 39:1-38) leading to maximum-likelihood estimates of haplotype frequencies under the assumption of Hardy-Weinberg proportions (random mating) is used (Excoffier et al., 1995 Mol. Biol. Evol. 12: 921-927). Alternative approaches are known in the art for association studies: genome-wide association studies, candidate region association studies and candidate gene association studies (Li et al. 2006 BMC Bioinformatics 7:258). The polymorphic markers of the present invention may be incorporated in any map of genetic markers of a plant genome in order to perform genome-wide association studies.
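The EM-based estimation mentioned above can be sketched, in a non-limiting way, for the smallest case in which phase is ambiguous: two biallelic loci genotyped in diploid individuals under assumed Hardy-Weinberg proportions. The genotype coding (0, 1, or 2 copies of allele "1" per locus), the function name, and the example data are hypothetical. For haploid or fully inbred material, phase is observed directly and simple counting suffices.

```python
def em_haplotype_frequencies(genotypes, max_iter=200, tol=1e-10):
    """Estimate frequencies of the haplotypes 00, 01, 10, 11.

    genotypes: list of (g1, g2), each the count (0, 1, 2) of allele "1"
    at locus 1 and locus 2 of one unphased diploid individual.
    """
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    fixed = {h: 0.0 for h in haps}  # counts from unambiguous genotypes
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1  # phase unknown: 00/11 or 01/10
        else:
            a1 = (g1 // 2, (g1 + 1) // 2)  # the two alleles at locus 1
            a2 = (g2 // 2, (g2 + 1) // 2)  # the two alleles at locus 2
            fixed[(a1[0], a2[0])] += 1
            fixed[(a1[1], a2[1])] += 1
    total = 2 * len(genotypes)
    freq = {h: 0.25 for h in haps}
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is 00/11.
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        p_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts divided by chromosomes sampled.
        counts = dict(fixed)
        counts[(0, 0)] += n_double_het * p_cis
        counts[(1, 1)] += n_double_het * p_cis
        counts[(0, 1)] += n_double_het * (1 - p_cis)
        counts[(1, 0)] += n_double_het * (1 - p_cis)
        new_freq = {h: counts[h] / total for h in haps}
        change = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if change < tol:
            break
    return freq


# Hypothetical sample: two homozygotes and one double heterozygote.
estimated = em_haplotype_frequencies([(0, 0), (2, 2), (1, 1)])
```

In this small example the unambiguous individuals carry only haplotypes 00 and 11, so the iterations assign the double heterozygote almost entirely to the 00/11 phase.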

The present invention comprises methods to detect an association between at least one haplotype in a haploid crop plant and a preferred trait, including a transgene, or a multiple trait index and calculate a haplotype effect estimate based on this association. In one aspect, the calculated haplotype effect estimates are used to make decisions in a breeding program. In another aspect, the calculated haplotype effect estimates are used in conjunction with the frequency of the at least one haplotype to calculate a haplotype breeding value that will be used to make decisions in a breeding program. A multiple trait index (MTI) is a numerical entity that is calculated through the combination of single trait values in a formula. Most often calculated as a linear combination of traits or normalized derivations of traits, it can also be the result of more sophisticated calculations (for example, use of ratios between traits). This MTI is used in genetic analysis as if it were a trait.
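As a non-limiting illustration, a multiple trait index formed as a linear combination of normalized trait values can be sketched as follows. The trait names, the weights, and the choice of z-score normalization are assumptions made for the example only.

```python
from statistics import mean, pstdev


def multiple_trait_index(records, weights):
    """Return one index value per record as a weighted sum of z-scores.

    records: list of dicts mapping trait name -> measured value.
    weights: dict mapping trait name -> weight (negative for traits
             where a lower value is preferred).
    """
    index = [0.0] * len(records)
    for trait, weight in weights.items():
        values = [record[trait] for record in records]
        center = mean(values)
        spread = pstdev(values)
        for i, value in enumerate(values):
            # A trait with no variation contributes nothing to the index.
            z = (value - center) / spread if spread > 0 else 0.0
            index[i] += weight * z
    return index


# Hypothetical data: grain yield (higher is better) and a disease score
# (lower is better) for three lines.
lines = [
    {"yield": 10.0, "disease": 3.0},
    {"yield": 12.0, "disease": 2.0},
    {"yield": 14.0, "disease": 1.0},
]
mti = multiple_trait_index(lines, {"yield": 1.0, "disease": -0.5})
```

The resulting values can then be analyzed as if they were a single trait, for example in the association tests discussed herein.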

Any given chromosome segment can be represented in a given population by a number of haplotypes that can vary from 1 (region is fixed), to the size of the population times the ploidy level of that species (2 in a diploid species), in a population in which every chromosome has a different haplotype. Identity-by-descent among haplotypes carried by multiple individuals in a non-fixed population will result in an intermediate number of haplotypes and possibly a differing frequency among the different haplotypes. New haplotypes may arise through recombination at meiosis between existing haplotypes in heterozygous progenitors. The frequency of each haplotype may be estimated by several means known to one versed in the art (e.g. by direct counting, or by using an EM algorithm). Let us assume that “k” different haplotypes, identified as “h_(i)” (i=1, . . . , k), are known, that their frequency in the population is “f_(i)” (i=1, . . . , k), and for each of these haplotypes we have an effect estimate “Est_(i)” (i=1, . . . , k). If we call the “haplotype breeding value” (BV_(i)) the effect on that population of fixing that haplotype, then this breeding value corresponds to the change in mean for the trait(s) of interest of that population between its original state of haplotype distribution at the window and a final state at which haplotype “h_(i)” is present at a frequency of 100%. The haplotype breeding value of h_(i) in this population is calculated as:

$BV_{i} = Est_{i} - \sum_{m = 1}^{k} Est_{m} f_{m}$

One skilled in the art will recognize that haplotypes that are rare in the population in which effects are estimated tend to be less precisely estimated; this difference of confidence may lead to adjustment in the calculation. For example, one can ignore the effects of rare haplotypes by calculating the breeding value of better known haplotypes after adjusting the frequency of these (by dividing it by the sum of the frequencies of the better known haplotypes). One could also provide confidence intervals for the breeding value of each haplotype.

The present invention anticipates that any particular haplotype breeding value will change according to the population for which it is calculated, as a function of difference of haplotype frequencies. The term “population” will thus assume different meanings; below are two examples of special cases. In one aspect, a population is a single inbred in which one intends to replace its current haplotype h_(j) by a new haplotype h_(i); in this case BV_(i)=Est_(i)−Est_(j). In another aspect, a “population” is an F₂ population in which the two parental haplotypes h_(i) and h_(j) are originally present in equal frequency (50%), in which case BV_(i)=½(Est_(i)−Est_(j)).
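The haplotype breeding value calculation, together with the two special cases and the optional adjustment for rare haplotypes described above, can be sketched as follows. This is a non-limiting illustration; the function name, the `min_freq` parameter, and all numerical values are hypothetical.

```python
def haplotype_breeding_values(effects, freqs, min_freq=0.0):
    """Return BV_i = Est_i - sum over m of (Est_m * f_m) for each haplotype.

    effects:  dict haplotype -> effect estimate (Est).
    freqs:    dict haplotype -> frequency in the population of interest.
    min_freq: haplotypes below this frequency are ignored, and the
              remaining frequencies are divided by their sum.
    """
    kept = [h for h in effects if freqs[h] >= min_freq]
    kept_total = sum(freqs[h] for h in kept)
    adjusted = {h: freqs[h] / kept_total for h in kept}
    # Current population mean at this haplotype window.
    population_mean = sum(effects[h] * adjusted[h] for h in kept)
    # Change in mean if a given haplotype were fixed at 100%.
    return {h: effects[h] - population_mean for h in kept}


effects = {"h_i": 3.0, "h_j": 1.0}

# Special case 1: a single inbred fixed for h_j, to be replaced by h_i.
inbred = haplotype_breeding_values(effects, {"h_i": 0.0, "h_j": 1.0})

# Special case 2: an F2 population with both haplotypes at 50%.
f2 = haplotype_breeding_values(effects, {"h_i": 0.5, "h_j": 0.5})

# Ignoring a rare, imprecisely estimated haplotype "c".
adjusted_bv = haplotype_breeding_values(
    {"a": 4.0, "b": 2.0, "c": 10.0},
    {"a": 0.45, "b": 0.45, "c": 0.10},
    min_freq=0.2,
)
```

With these values, the inbred case gives BV for h_i equal to Est_(i)−Est_(j) and the F₂ case gives half of that difference, in agreement with the expressions above.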

These statistical approaches enable haplotype effect estimates to inform breeding decisions in multiple contexts. Other statistical approaches to calculate breeding values are known to those skilled in the art and can be used in substitution without departing from the spirit and scope of this invention.

In cases where conserved genetic segments, or haplotype windows, are coincident with segments in which QTL have been identified it is possible to deduce with high probability that QTL inferences can be extrapolated to other germplasm having an identical haplotype in that haplotype window. This a priori information provides the basis to select for favorable QTLs prior to QTL mapping within a given population.

For example, plant breeding decisions could comprise:

a) Selection among haploid breeding populations to determine which populations have the highest frequency of favorable haplotypes, wherein haplotypes are designated as favorable based on coincidence with previous QTL mapping and preferred populations undergo doubling; or

b) Selection of haploid progeny containing the favorable haplotypes in breeding populations prior to, or in substitution for, QTL mapping within that population, wherein selection could be done at any stage of breeding and at any generation of a selection and can be followed by doubling; or

c) Prediction of progeny performance for specific breeding crosses; or

d) Selection of haploid plants for doubling for subsequent use in germplasm improvement activities based on the favorable haplotypes, including line development, hybrid development, selection among transgenic events based on the breeding value of the haplotype that the transgene was inserted into, making breeding crosses, testing and advancing a plant through self fertilization, using plant or parts thereof for transformation, using plants or parts thereof for candidates for expression constructs, and using plant or parts thereof for mutagenesis.
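Two of the decisions listed above, selection among haploid breeding populations and selection of individual haploid plants for doubling, can be sketched in a non-limiting way as follows. The population names, plant identifiers, haplotype labels, and function names are hypothetical. Each haploid plant is assumed to carry exactly one haplotype at the haplotype window in question.

```python
def favorable_haplotype_frequency(population, favorable):
    """Fraction of haploid plants carrying a favorable haplotype.

    population: dict plant id -> haplotype label at one haplotype window.
    favorable:  set of haplotype labels designated as favorable.
    """
    carriers = sum(1 for hap in population.values() if hap in favorable)
    return carriers / len(population)


def rank_populations(populations, favorable):
    """Population names ordered from highest to lowest favorable frequency."""
    return sorted(
        populations,
        key=lambda name: favorable_haplotype_frequency(populations[name], favorable),
        reverse=True,
    )


def plants_to_double(population, favorable):
    """Identifiers of haploid plants to advance for chromosome doubling."""
    return sorted(plant for plant, hap in population.items() if hap in favorable)


# Hypothetical haploid genotype calls at one haplotype window.
populations = {
    "pop_A": {"A1": "H1", "A2": "H2", "A3": "H1", "A4": "H3"},
    "pop_B": {"B1": "H2", "B2": "H3", "B3": "H1", "B4": "H3"},
}
favorable = {"H1"}

ranking = rank_populations(populations, favorable)
selected = plants_to_double(populations[ranking[0]], favorable)
```

Restricting doubling to the selected plants reflects the resource saving described herein, since only haploid plants carrying a genotype of interest are advanced.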

In cases where haplotype windows are coincident with segments in which genes have been identified it is possible to deduce with high probability that gene inferences can be extrapolated to other germplasm having an identical genotype, or haplotype, in that haplotype window. This a priori information provides the basis to select for favorable genes or gene alleles on the basis of haplotype identification within a given population. For example, plant breeding decisions could comprise:

a) Selection among haploid breeding populations to determine which populations have the highest frequency of favorable haplotypes, wherein haplotypes are designated as favorable based on coincidence with previous gene mapping and preferred populations undergo doubling; or

b) Selection of haploid progeny containing the favorable haplotypes in breeding populations, wherein selection is effectively enabled at the gene level, wherein selection could be done at any stage of breeding and at any generation of a selection and can be followed by doubling; or

c) Prediction of progeny performance for specific breeding crosses; or

d) Selection of haploid plants for doubling for subsequent use in germplasm improvement activities based on the favorable haplotypes, including line development, hybrid development, selection among transgenic events based on the breeding value of the haplotype that the transgene was inserted into, making breeding crosses, testing and advancing a plant through self fertilization, using plants or parts thereof for transformation, using plants or parts thereof for candidates for expression constructs, and using plants or parts thereof for mutagenesis.

A preferred haplotype provides a preferred property to a parent plant and to the progeny of the parent when selected by a marker means or phenotypic means. The method of the present invention provides for selection of preferred haplotypes, or haplotypes of interest, and the accumulation of these haplotypes in a breeding population.

In the present invention, haplotypes and associations of haplotypes to one or more phenotypic traits provide the basis for making breeding decisions and germplasm improvement activities. Non-limiting examples of breeding decisions include progeny selection, parent selection, and recurrent selection for at least one haplotype. In another aspect, breeding decisions relating to development of plants for commercial release comprise advancing plants for testing, advancing plants for purity, purification of sublines during development, inbred development, variety development, and hybrid development. In yet other aspects, breeding decisions and germplasm improvement activities comprise transgenic event selection, making breeding crosses, testing and advancing a plant through self-fertilization, using plants or parts thereof for transformation, using plants or parts thereof for candidates for expression constructs, and using plants or parts thereof for mutagenesis.

In another embodiment, this invention enables indirect selection through selection decisions for at least one phenotype based on at least one numerical value that is correlated, either positively or negatively, with one or more other phenotypic traits. For example, a selection decision for any given haplotype effectively results in selection for multiple phenotypic traits that are associated with the haplotype.

In still another embodiment, the present invention acknowledges that preferred haplotypes identified by the methods presented herein may be advanced as candidate genes for inclusion in expression constructs, i.e., transgenes. Nucleic acids underlying haplotypes of interest may be expressed in plant cells by operably linking them to a promoter functional in plants. In another aspect, nucleic acids underlying haplotypes of interest may have their expression modified by double-stranded RNA-mediated gene suppression, also known as RNA interference (“RNAi”), which includes suppression mediated by small interfering RNAs (“siRNA”), trans-acting small interfering RNAs (“ta-siRNA”), or microRNAs (“miRNA”). Examples of RNAi methodology suitable for use in plants are described in detail in U.S. Patent Application Publications 2006/0200878 and 2007/0011775.

Methods are known in the art for assembling and introducing constructs into a cell in such a manner that the nucleic acid molecule for a trait is transcribed into a functional mRNA molecule that is translated and expressed as a protein product. For the practice of the present invention, conventional compositions and methods for preparing and using constructs and host cells are well known to one skilled in the art, see for example, Molecular Cloning: A Laboratory Manual, 3rd Edition Volumes 1, 2, and 3 (2000) J. F. Sambrook, D. W. Russell, and N. Irwin, Cold Spring Harbor Laboratory Press. Methods for making transformation constructs particularly suited to plant transformation include, without limitation, those described in U.S. Pat. Nos. 4,971,908, 4,940,835, 4,769,061 and 4,757,011, all of which are herein incorporated by reference in their entirety. Transformation methods for the introduction of expression units into plants are known in the art and include electroporation as illustrated in U.S. Pat. No. 5,384,253; microprojectile bombardment as illustrated in U.S. Pat. Nos. 5,015,580; 5,550,318; 5,538,880; 6,160,208; 6,399,861; and 6,403,865; protoplast transformation as illustrated in U.S. Pat. No. 5,508,184; and Agrobacterium-mediated transformation as illustrated in U.S. Pat. Nos. 5,635,055; 5,824,877; 5,591,616; 5,981,840; and 6,384,301.

Another preferred embodiment of the present invention is to build additional value by selecting a composition of haplotypes wherein each haplotype has a haplotype effect estimate that is not negative with respect to yield, or is not positive with respect to maturity, or is null with respect to maturity, or amongst the best 50 percent with respect to a phenotypic trait, transgene, and/or a multiple trait index when compared to any other haplotype at the same chromosome segment in a set of germplasm, or amongst the best 50 percent with respect to a phenotypic trait, transgene, and/or a multiple trait index when compared to any other haplotype across the entire genome in a set of germplasm, or the haplotype being present with a frequency of 75 percent or more in a breeding population or a set of germplasm provides evidence of its high value, or any combination of these.

This invention anticipates a stacking of haplotypes from multiple windows into plants or lines by crossing parent plants or lines containing different haplotype regions. The value of the plant or line comprising in its genome stacked haplotype regions is estimated by a composite breeding value, which depends on a combination of the value of the traits and the value of the haplotype(s) to which the traits are linked. The present invention further anticipates that the composite breeding value of a plant or line is improved by modifying the components of one or each of the haplotypes. Additionally, the present invention anticipates that additional value can be built into the composite breeding value of a plant or line by selection of at least one recipient haplotype with a preferred haplotype effect estimate or, in conjunction with the haplotype frequency, breeding value to which one or any of the other haplotypes are linked, or by selection of plants or lines for stacking haplotypes by breeding.

Another embodiment of this invention is a method for enhancing breeding populations by accumulation of one or more preferred haplotypes in a set of germplasm. Genomic regions defined as haplotype windows include genetic information that contributes to one or more phenotypic traits of the plant. Variations in the genetic information at one or more loci can result in variation of one or more phenotypic traits, wherein the value of the phenotype can be measured. The genetic mapping of the haplotype windows allows for a determination of linkage across haplotypes. A haplotype of interest has a DNA sequence that is novel in the genome of the progeny plant and can in itself serve as a genetic marker for the haplotype of interest. Notably, this marker can also be used as an identifier for a gene or QTL. For example, in the event of multiple traits or trait effects associated with the haplotype, only one marker would be necessary for selection purposes. Additionally, the haplotype of interest may provide a means to select for plants that have the linked haplotype region. Selection can be performed by screening for tolerance to an applied phytotoxic chemical, such as an herbicide or antibiotic, or for pathogen resistance. Selection may also be performed using phenotypic selection means, such as a morphological phenotype that is easy to observe, for example seed color, seed germination characteristic, seedling growth characteristic, leaf appearance, plant architecture, plant height, and flower and fruit morphology.

The present invention also provides for the screening of progeny haploid plants for haplotypes of interest and using haplotype effect estimates as the basis for selection for use in a breeding program to enhance the accumulation of preferred haplotypes. The method includes:

a) providing a breeding population comprising at least two haploid plants wherein the genome of the breeding population comprises a plurality of haplotype windows and each of the plurality of haplotype windows comprises at least one haplotype; and

b) associating a haplotype effect estimate for one or more traits for two or more haplotypes from one or more of the plurality of haplotype windows, wherein the haplotype effect estimate can then be used to calculate a breeding value that is a function of the estimated effect for any given phenotypic trait and the frequency of each of the at least two haplotypes; and

c) ranking one or more of the haplotypes on the basis of a value, wherein the value is a haplotype effect estimate, a haplotype frequency, or a breeding value and wherein the value is the basis for determining whether a haplotype is a preferred haplotype, or haplotype of interest; and

d) utilizing the ranking as the basis for decision-making in a breeding program; and

e) selecting at least one progeny haploid plant for doubling on the basis of the presence of the respective markers associated with the haplotypes of interest, wherein the progeny haploid plant comprises in its genome at least a portion of the haplotype or haplotypes of interest of the first plant and at least one preferred haplotype of the second plant; and

f) using resulting doubled haploid plants in activities related to germplasm improvement wherein the activities are selected from the group consisting of line and variety development, hybrid development, transgenic event selection, making breeding crosses, testing and advancing a plant through self fertilization, using plants or parts thereof for transformation, using plants or parts thereof for candidates for expression constructs, and using plants or parts thereof for mutagenesis.
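The association and ranking steps of the method can be illustrated with a minimal Python sketch. The source states only that a breeding value is a function of the haplotype effect estimate and the haplotype frequency; the particular function used here (effect minus the frequency-weighted mean effect within the same window) is an assumption chosen for illustration, and the class, function names, window labels, and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Haplotype:
    window: str       # haplotype window label (hypothetical)
    name: str         # haplotype label within the window (hypothetical)
    effect: float     # haplotype effect estimate for one trait
    frequency: float  # haplotype frequency in the breeding population


def breeding_values(haplotypes):
    """Map (window, name) to effect minus the frequency-weighted mean effect
    of all haplotypes in the same window (an assumed, illustrative function)."""
    by_window = {}
    for hap in haplotypes:
        by_window.setdefault(hap.window, []).append(hap)
    values = {}
    for window, group in by_window.items():
        total_freq = sum(h.frequency for h in group)
        mean_effect = sum(h.effect * h.frequency for h in group) / total_freq
        for h in group:
            values[(window, h.name)] = h.effect - mean_effect
    return values


def rank_haplotypes(haplotypes):
    """Return (window, name) keys ordered from highest to lowest value."""
    values = breeding_values(haplotypes)
    return sorted(values, key=values.get, reverse=True)


# Made-up numbers for illustration only.
example = [
    Haplotype("W1", "H1", effect=0.30, frequency=0.25),
    Haplotype("W1", "H2", effect=0.10, frequency=0.75),
    Haplotype("W2", "H3", effect=0.20, frequency=0.50),
    Haplotype("W2", "H4", effect=0.00, frequency=0.50),
]
ranking = rank_haplotypes(example)
```

Any other function of effect and frequency could be substituted in `breeding_values` without changing the ranking step.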

Using this method, the present invention contemplates that haplotypes of interest are selected from a large population of plants, and the selected haplotypes can have a synergistic breeding value in the germplasm of a crop plant. Additionally, this invention provides for using the selected haplotypes in the described breeding methods to accumulate other beneficial and preferred haplotype regions and to be maintained in a breeding population to enhance the overall germplasm of the crop plant.

Plant Breeding

Plants of the present invention can be part of or generated from a breeding program. The choice of breeding method depends on the mode of plant reproduction, the heritability of the trait(s) being improved, and the type of cultivar used commercially (e.g., F₁ hybrid cultivar, pureline cultivar, etc.). A cultivar is a race or variety of a plant species that has been created or selected intentionally and maintained through cultivation.

Selected, non-limiting approaches for breeding the plants of the present invention are set forth below. A breeding program can be enhanced using marker assisted selection (MAS) on the progeny of any cross. It is understood that nucleic acid markers of the present invention can be used in a MAS (breeding) program. It is further understood that any commercial and non-commercial cultivars can be utilized in a breeding program. Factors such as emergence vigor, vegetative vigor, stress tolerance, disease resistance, branching, flowering, seed set, seed size, seed density, standability, and threshability will generally dictate the choice.

Genotyping can be further economized by high throughput, non-destructive seed sampling. In one embodiment, plants can be screened for one or more markers, such as genetic markers, using high throughput, non-destructive seed sampling. In a preferred aspect, haploid seed is sampled in this manner and only seed with at least one marker genotype of interest is advanced for doubling. Apparatus and methods for the high-throughput, non-destructive sampling of seeds have been described which would overcome the obstacles of statistical samples by allowing for individual seed analysis. For example, U.S. patent application Ser. No. 11/213,430 (filed Aug. 26, 2005); U.S. patent application Ser. No. 11/213,431 (filed Aug. 26, 2005); U.S. patent application Ser. No. 11/213,432 (filed Aug. 26, 2005); U.S. patent application Ser. No. 11/213,434 (filed Aug. 26, 2005); U.S. patent application Ser. No. 11/213,435 (filed Aug. 26, 2005); and U.S. patent application Ser. No. 11/680,611 (filed Mar. 2, 2007), which are incorporated herein by reference in their entirety, disclose apparatus and systems for the automated sampling of seeds as well as methods of sampling, testing and bulking seeds.

For highly heritable traits, a choice of superior individual plants evaluated at a single location will be effective, whereas for traits with low heritability, selection should be based on mean values obtained from replicated evaluations of families of related plants. Popular selection methods commonly include pedigree selection, modified pedigree selection, mass selection, and recurrent selection. In a preferred aspect, a backcross or recurrent breeding program is undertaken.

The complexity of inheritance influences choice of the breeding method. Backcross breeding can be used to transfer one or a few favorable genes for a highly heritable trait into a desirable cultivar. This approach has been used extensively for breeding disease-resistant cultivars. Various recurrent selection techniques are used to improve quantitatively inherited traits controlled by numerous genes.

Breeding lines can be tested and compared to appropriate standards in environments representative of the commercial target area(s) for two or more generations. The best lines are candidates for new commercial cultivars; those still deficient in traits may be used as parents to produce new populations for further selection.

The development of new elite corn hybrids requires the development and selection of elite inbred lines, the crossing of these lines and selection of superior hybrid crosses. The hybrid seed can be produced by manual crosses between selected male-fertile parents or by using male sterility systems. Additional data on parental lines, as well as the phenotype of the hybrid, influence the breeder's decision whether to continue with the specific hybrid cross.

Pedigree breeding and recurrent selection breeding methods can be used to develop cultivars from breeding populations. Breeding programs combine desirable traits from two or more cultivars or various broad-based sources into breeding pools from which cultivars are developed by selfing and selection of desired phenotypes. New cultivars can be evaluated to determine which have commercial potential.

Backcross breeding has been used to transfer genes for a simply inherited, highly heritable trait into a desirable homozygous cultivar or inbred line, which is the recurrent parent. The source of the trait to be transferred is called the donor parent. After the initial cross, individuals possessing the phenotype of the donor parent are selected and repeatedly crossed (backcrossed) to the recurrent parent. The resulting plant is expected to have most attributes of the recurrent parent (e.g., cultivar) and, in addition, the desirable trait transferred from the donor parent.

The single-seed descent procedure in the strict sense refers to planting a segregating population, harvesting a sample of one seed per plant, and using the one-seed sample to plant the next generation. When the population has been advanced from the F₂ to the desired level of inbreeding, the plants from which lines are derived will each trace to different F₂ individuals. The number of plants in a population declines each generation due to failure of some seeds to germinate or some plants to produce at least one seed. As a result, not all of the F₂ plants originally sampled in the population will be represented by a progeny when generation advance is completed.

Descriptions of other breeding methods that are commonly used for different traits and crops can be found in one of several reference books (Allard, “Principles of Plant Breeding,” John Wiley & Sons, NY, U. of CA, Davis, Calif., 50-98, 1960; Simmonds, “Principles of Crop Improvement,” Longman, Inc., NY, 369-399, 1979; Sneep and Hendriksen, “Plant Breeding Perspectives,” Wageningen (ed), Center for Agricultural Publishing and Documentation, 1979; Fehr, In: Soybeans: Improvement, Production and Uses, 2nd Edition, Monograph 16:249, 1987; Fehr, “Principles of Variety Development,” Theory and Technique, (Vol. 1) and Crop Species Soybean (Vol. 2), Iowa State Univ., Macmillan Pub. Co., NY, 360-376, 1987).

An alternative to traditional QTL mapping involves achieving higher resolution by mapping haplotypes rather than individual markers (Fan et al., 2006 Genetics 172:663-686). This approach tracks blocks of DNA known as haplotypes, as defined by polymorphic markers, which are assumed to be identical by descent in the mapping population. This assumption results in a larger effective sample size, offering greater resolution of QTL. The statistical significance of a correlation between a phenotype and a genotype, in this case a haplotype, may be determined by any statistical test known in the art and with any accepted threshold of statistical significance. The application of particular methods and thresholds of significance is well within the skill of the ordinary practitioner of the art.

It is further understood, that the present invention provides bacterial, viral, microbial, insect, mammalian and plant cells comprising the nucleic acid molecules of the present invention.

As used herein, a “nucleic acid molecule,” be it a naturally occurring molecule or otherwise may be “substantially purified”, if desired, referring to a molecule separated from substantially all other molecules normally associated with it in its native state. More preferably a substantially purified molecule is the predominant species present in a preparation. A substantially purified molecule may be greater than 60% free, preferably 75% free, more preferably 90% free, and most preferably 95% free from the other molecules (exclusive of solvent) present in the natural mixture. The term “substantially purified” is not intended to encompass molecules present in their native state.

The agents of the present invention will preferably be “biologically active” with respect to either a structural attribute, such as the capacity of a nucleic acid to hybridize to another nucleic acid molecule, or the ability of a protein to be bound by an antibody (or to compete with another molecule for such binding). Alternatively, such an attribute may be catalytic, and thus involve the capacity of the agent to mediate a chemical reaction or response.

The agents of the present invention may also be recombinant. As used herein, the term recombinant means any agent (e.g. DNA, peptide etc.), that is, or results, however indirect, from human manipulation of a nucleic acid molecule.

The agents of the present invention may be labeled with reagents that facilitate detection of the agent (e.g. fluorescent labels (Prober et al., 1987 Science 238:336-340; Albarella et al., European Patent 144914), chemical labels (Sheldon et al., U.S. Pat. No. 4,582,789; Albarella et al., U.S. Pat. No. 4,563,417), modified bases (Miyoshi et al., European Patent 119448)).

The present invention provides methods to identify and use QTL and haplotype information by screening haploid material that enables a breeder to make informed breeding decisions. The methods and compositions of the present invention enable the determination of at least one genotype of interest from one or more haploid plants. In another aspect, a haploid plant comprising at least one genotype of interest can undergo doubling and be advanced in a breeding program. In yet another aspect, a priori QTL and haplotype information can be leveraged, as disclosed in U.S. Patent Application Ser. No. 60/837,864, which is incorporated herein by reference in its entirety, using markers underlying at least one haplotype window, and the resulting fingerprint is used to identify the haplotypic composition of the haplotype window which is subsequently associated with one or more haplotype effect estimates for one or more phenotypic traits as disclosed therein. This information is valuable in decision-making for a breeder because it enables a selection decision to be based on estimated phenotype without having to phenotype the plant per se. Further, it is preferred to make decisions based on genotype rather than phenotype due to the fact that phenotype is influenced by multiple biotic and abiotic factors that can confound evaluation of any given trait and performance prediction. The invention thus allows the identification of one or more preferred haploid plants such that only preferred plants undergo the doubling process, thus economizing the DH process.

In another aspect, one or more haplotypes are determined by genotyping one or more haploid plants using markers for one or more haplotype windows. The breeder is able to match the haplotypes with their respective haplotype effect estimates for one or more phenotypes of interest and make a decision based on the preferred haplotype. Haploid plants comprising one or more preferred haplotypes are doubled using one or more methods known in the art and then advanced in the breeding program.
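As a minimal sketch of this selection step, the Python below reads the haplotype of each haploid plant from its marker alleles in one window, looks up an a priori effect estimate, and keeps only plants whose haplotype exceeds a threshold. The marker names, plant identifiers, effect values, the `min_effect` threshold, and the convention that a higher effect is preferred are all hypothetical assumptions, not taken from the source.

```python
def haplotype_of(genotype, window_markers):
    """Concatenate the single allele a haploid plant carries at each marker
    of one haplotype window, in marker order."""
    return "".join(genotype[marker] for marker in window_markers)


def select_for_doubling(plants, window_markers, effect_estimates, min_effect=0.0):
    """Return ids of haploid plants whose haplotype in the window has a known
    effect estimate above min_effect (threshold is an assumed convention)."""
    selected = []
    for plant_id, genotype in plants.items():
        hap = haplotype_of(genotype, window_markers)
        effect = effect_estimates.get(hap)  # None if the haplotype is unknown
        if effect is not None and effect > min_effect:
            selected.append(plant_id)
    return selected


# Made-up genotypes and effect estimates for illustration only.
window = ["m1", "m2"]
plants = {
    "P1": {"m1": "C", "m2": "T"},
    "P2": {"m1": "G", "m2": "T"},
    "P3": {"m1": "C", "m2": "A"},
}
effect_estimates = {"CT": 0.30, "GT": -0.10}
to_double = select_for_doubling(plants, window, effect_estimates)
```

Plants carrying a haplotype with no prior estimate (here `P3`) are excluded in this sketch; a breeder could equally choose to retain them for mapping.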

In one aspect, advancement decisions in line development breeding are traditionally made based on phenotype, wherein decisions are made between two or more plants showing segregation for one or more phenotypic traits. An advantage of the present invention is the ability to make decisions based on haplotypes wherein a priori information is leveraged, enabling “predictive breeding.” In this aspect, during line development breeding for a crop plant, sublines are evaluated for segregation at one or more marker loci. Individuals segregating at one or more haplotype windows can be identified unambiguously using genotyping and, for any given haplotype window, individuals comprising the preferred haplotype are selected. In preferred aspects, the selection decision is based on a haplotype effect estimate, a haplotype frequency, or a breeding value.

Having now generally described the invention, the same will be more readily understood through reference to the following examples which are provided by way of illustration, and are not intended to be limiting of the present invention, unless specified.

EXAMPLES

The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventors to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the concept, spirit and scope of the invention.

Example 1 Phenotyping for Gray Leaf Spot (GLS) Reaction

Haploid plants can be used to genetically map QTL associated with resistance to Gray Leaf Spot (GLS). In order to detect QTL associated with GLS resistance, plants were phenotyped to determine GLS reaction. The following rating scale was used for phenotypic rating of GLS in all studies. The percentage of leaf area infected is used to rate plants on a scale of 1 (very resistant) to 9 (susceptible). Disease reaction is visually evaluated after pollination. The infection can be natural or by artificial inoculation in the experiments.

TABLE 1 Description of rating scale for GLS phenotyping

Description      Rating  Symptoms
Very Resistant   1       0% of leaf area infected; no visible lesions
Very Resistant   2       ILA < 1%; few lesions, dispersed through lower leaves
Resistant        3       1% ≦ ILA ≦ 20%
Resistant        4       20% ≦ ILA ≦ 40%
Mid-resistant    5       40% ≦ ILA ≦ 50%; lesions reaching ear leaf, with sparse lesions in the leaves above the ear
Mid-Susceptible  6       50% ≦ ILA ≦ 60%; lesions reaching the leaves above the ear
Susceptible      7       60% ≦ ILA ≦ 75%
Susceptible      8       75% ≦ ILA ≦ 90%
Susceptible      9       >90% of foliar area infected, with premature death of the plant before forming black layer

ILA = infected leaf area.
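The scale in Table 1 can be expressed as a simple lookup from infected leaf area (ILA) to rating. The sketch below is illustrative only. Because adjacent classes in the table share their boundary values (for example, 20% appears in both ratings 3 and 4), the code assumes that a boundary value receives the lower, more resistant rating, which is consistent with rating 9 being defined as strictly greater than 90%. The function name is hypothetical.

```python
def ila_to_rating(ila_percent):
    """Convert infected leaf area (percent, 0-100) to the 1-9 rating of Table 1.
    Assumption: a value on a shared class boundary gets the lower rating."""
    if not 0 <= ila_percent <= 100:
        raise ValueError("ILA must be between 0 and 100 percent")
    if ila_percent == 0:
        return 1  # no visible lesions
    if ila_percent < 1:
        return 2  # ILA < 1%
    # (upper bound in percent, rating) for classes 3 through 8
    for upper, rating in ((20, 3), (40, 4), (50, 5), (60, 6), (75, 7), (90, 8)):
        if ila_percent <= upper:
            return rating
    return 9  # more than 90% of foliar area infected
```

For instance, `ila_to_rating(20)` returns 3 and `ila_to_rating(20.1)` returns 4 under the stated boundary assumption.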

Example 2 Phenotyping for Goss' Wilt Reaction

Haploid plants can be used to genetically map QTL associated with resistance to Goss' Wilt. In order to detect QTL associated with resistance to Goss' Wilt, plants were phenotyped to determine Goss' Wilt reaction. The following rating scale was used in order to assess resistance or susceptibility to Goss' Wilt. Phenotypic evaluation of Goss' Wilt reaction is based on percentage of infected leaf area and rated according to a 1 (very resistant) to 9 (susceptible) scale (Table 2). Plants are artificially inoculated and visually rated approximately 3 to 4 weeks after pollination.

TABLE 2 Description of rating scale for Goss' Wilt phenotyping

Description      Rating  Symptoms
Very Resistant   1       0% of leaf area infected; no visible lesions
Very Resistant   2       ILA < 1%; few lesions, dispersed through lower leaves
Resistant        3       1% ≦ ILA ≦ 20%
Resistant        4       20% ≦ ILA ≦ 40%
Mid-resistant    5       40% ≦ ILA ≦ 50%
Mid-Susceptible  6       50% ≦ ILA ≦ 60%; lesions
Susceptible      7       60% ≦ ILA ≦ 75%
Susceptible      8       75% ≦ ILA ≦ 90%
Susceptible      9       >90% of foliar area infected

ILA = infected leaf area.

Example 3 Haploid Mapping Study for GLS Resistance with I133314/I206447 Population

The utility of haploid plants in genetic mapping of traits of interest is demonstrated in the following example. A haploid population was developed by crossing the inbred corn lines I133314 by I206447 and then inducing the resulting F1 hybrid to produce 1945 haploid plants. For mapping, 82 SNP markers were used to screen the haploid population. Phenotypic data relating to GLS reaction were collected on the population. Composite interval mapping was conducted to examine significant associations between GLS and the SNP markers. Table 3 provides the significant marker associations found in this study. QTL associated with GLS resistance were identified by genetic mapping with haploid plants. The source of the favorable allele for GLS resistance was I206447 for all markers except NC0151453 (SEQ ID NO: 1231) in which the source of the favorable allele was I133314. The chromosome (Chr.) location, chromosome position (Chr. pos), and favorable allele are provided for each marker in Table 3.

TABLE 3 Markers useful for detecting QTL associated with GLS in the I133314/I206447 haploid mapping population.

Marker     Chr.  Chr. position  GLS QTL  LOD    Effect  Favorable Allele  Marker SEQ ID  SNP Position*
NC0147103  1     39.1           177      6.15   0.17    C                 1228           1001
NC0202383  2     19             2        20.38  0.30    T                 1229           34
NC0201657  2     179.2          178      26.30  0.34    T                 1230           342
NC0055894  3     112.4          57       6.74   0.17    T                 421            202
NC0151453  6     75.1           110      17.17  −0.28   T                 1231           119

*SNP position: refers to position of the SNP polymorphism in the indicated SEQ ID NO.

Example 4 Haploid Mapping Study for GLS resistance with I294213/I283669 Population

The utility of haploid plants in genetic mapping of traits of interest is demonstrated in the following example. A haploid mapping population was developed by crossing the inbred corn lines I294213 by I283669. The resulting F1 hybrid was induced to produce 1895 haploid seed. For mapping, 82 SNP markers were used to screen the haploid population. Composite interval mapping was conducted to examine significant associations between GLS and the SNP markers. Table 4 provides the significant marker associations found in this study. QTL associated with GLS resistance were identified by genetic mapping with haploid plants. The source of the favorable alleles was I283669 for all markers except NC0003425 (SEQ ID NO: 1127) in which the source of the favorable allele was I294213. The chromosome (Chr.) location, chromosome position (Chr. pos), and favorable allele are provided for each marker in Table 4.

TABLE 4 Markers useful for detecting QTL associated with GLS resistance in the I294213/I283669 haploid mapping population.

Marker     Chr.  Chr. Position  GLS QTL  LOD     Effect  Favorable Allele  Marker SEQ ID  SNP Position*
NC0052741  1     49.5           5        136.19  0.75    G                 36             411
NC0028145  3     187.5          64       3.19    0.10    G                 481            307
NC0143354  5     1.8            88       8.33    0.16    C                 659            303
NC0040408  6     59.1           108      13.29   0.22    T                 1232           336
NC0109097  7     93.8           127      5.93    0.13    T                 1233           97
NC0003425  9     84.5           158      20.30   −0.25   G                 1127           280
NC0199588  10    99.9           173      15.36   0.23    G                 1219           137

*SNP Position: refers to the position of the SNP polymorphism in the indicated SEQ ID NO.

Example 5 Haploid Mapping Study for Goss' Wilt with I208993/LH287 Population

The utility of haploid plants in genetic mapping of traits of interest is further demonstrated in the following example. A mapping population was developed for using haploid plants to map QTL associated with resistance to Goss' Wilt. The population was from the cross of inbred corn lines I208993 by LH287. F1 plants were induced to produce haploid seed. From the I208993/LH287 population, 1384 haploid plants were inoculated with the Goss' Wilt pathogen and phenotyped using a truncated rating scale of 1, 5, or 9. Ratings were done approximately 3 to 4 weeks after pollination. Plants rated either 1 or 9 were used in the QTL mapping. By using only the extreme values (1 or 9), the environmental variation that is inherent in disease phenotyping was reduced, and the two extreme classes served as the bulks for a bulked segregant analysis from which to detect major QTL. Genotyping was done using 114 SNP markers. Composite interval mapping was conducted to examine significant associations between Goss' Wilt and SNP markers. Table 5 provides markers useful for detecting QTL associated with resistance to Goss' Wilt in the I208993/LH287 haploid mapping population. The chromosome (Chr.) location, chromosome position (Chr. pos), and favorable (Fav.) allele are also provided in Table 5.
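The extreme-phenotype filtering described above can be sketched in a few lines of Python. This is an illustration under assumptions: the function name, plant identifiers, and ratings are hypothetical, and the only behavior taken from the text is that plants rated 1 or 9 are kept and plants rated 5 are set aside.

```python
def extreme_bulks(ratings, resistant=1, susceptible=9):
    """Split plants into resistant and susceptible bulks, dropping every
    plant with an intermediate rating. `ratings` maps plant id to rating."""
    resistant_bulk = sorted(p for p, r in ratings.items() if r == resistant)
    susceptible_bulk = sorted(p for p, r in ratings.items() if r == susceptible)
    return resistant_bulk, susceptible_bulk


# Made-up ratings on the truncated 1 / 5 / 9 scale, for illustration only.
ratings = {"h1": 1, "h2": 5, "h3": 9, "h4": 1, "h5": 9}
resistant_bulk, susceptible_bulk = extreme_bulks(ratings)
```

Only the plants in the two returned bulks would then be carried into the QTL analysis.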

TABLE 5 Markers useful for detecting QTL associated with Goss' Wilt resistance in the I208993/LH287 haploid mapping population.

Marker     Chr  Chr pos.  Goss' Wilt QTL  Likelihood ratio  LOD       Additive effect  Fav. allele  SEQ ID  SNP Position*
NC0202383  2    19        22              100.304           21.78074  0.737618         T            1229    34
NC0199732  2    37        24              113.9429          24.74239  0.779994         T            1276    138
NC0048553  2    46.8      25              103.8964          22.56081  0.758496         A            234     485
NC0201646  2    55.4      129             96.43437          20.94046  0.746649         T            1294    416
NC0201821  2    71.4      27              40.13758          8.715765  0.202738         T            1295    331
NC0019110  2    75.1      27              28.41102          6.169374  0.173568         C            1278    153
NC0004821  3    54.4      40              47.57959          10.33178  0.451741         C            371     294
NC0200643  3    70.3      122             47.48045          10.31025  0.424893         C            1296    106
NC0040461  4    51.2      125             80.02493          17.37719  0.620383         A            1282    366
NC0034462  4    67.8      52              76.55974          16.62474  0.574876         T            1250    301
NC0200535  4    132       58              29.47242          6.399855  0.142544         T            1297    411
NC0029435  4    138       58              29.25183          6.351953  0.139488         G            1298    551
NC0011194  5    29.3      63              27.51088          5.973912  −0.227689        C            1299    218
NC0016527  5    49        66              29.15712          6.331388  −0.219392        T            1255    351
NC0202055  5    76.4      68              26.18668          5.686366  −0.252002        T            1300    505
NC0147719  5    160       130             47.9265           10.40711  0.492815         G            1301    48
NC0012417  5    175       74              48.68852          10.57258  0.505586         T            768     137
NC0113381  6    83.8      79              28.96126          6.288858  −0.21407         A            850     303
NC0022200  6    93.7      80              31.16025          6.766361  −0.201408        G            1302    153
NC0010347  8    69.2      131             27.38218          5.945966  −0.144382        T            1015    160
NC0199582  8    86.3      99              26.24576          5.699195  −0.169537        A            1303    201

*SNP position: refers to the position of the SNP polymorphism in the indicated SEQ ID NO.
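Table 5 reports both a likelihood ratio and a LOD score for each marker. The source does not state how the two are related, but the tabulated pairs appear consistent with the standard conversion LOD = LR / (2 ln 10), where LR is the likelihood ratio statistic. The short Python sketch below applies that conversion; the function name is hypothetical and the relation is an inference from the numbers rather than a statement from the source.

```python
import math


def likelihood_ratio_to_lod(likelihood_ratio):
    """Convert a likelihood ratio statistic to a LOD score using the
    standard relation LOD = LR / (2 * ln 10). Applying this relation to the
    table values is an assumption inferred from the reported numbers."""
    return likelihood_ratio / (2.0 * math.log(10.0))


# Check against the first two rows of Table 5 (NC0202383 and NC0199732).
lod_first = likelihood_ratio_to_lod(100.304)
lod_second = likelihood_ratio_to_lod(113.9429)
```

Under this relation the divisor is approximately 4.605, so the first row gives a LOD of about 21.78, matching the tabulated value to the reported precision.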

Example 6 Haploid Mapping Study for Goss' Wilt with I208993/LH295 Population

The utility of haploid plants in genetic mapping of traits of interest is further demonstrated in the following example. A mapping population was developed for using haploid plants to map QTL associated with resistance to Goss' Wilt. The population was from the cross of LH295 by I208993. F1 plants were induced to produce haploid seed.

From the I208993/LH295 haploid mapping population, 980 individuals were naturally exposed to the Goss' Wilt pathogen and phenotyped using a modified rating scale of 1, 5, or 9. Plants were rated approximately 3 to 4 weeks after pollination. Plants rated either 1 or 9 were used in the QTL mapping. By using only the extreme values (1 or 9), environmental variation that is inherent in disease phenotyping was reduced and a bulked segregant analysis was created from which to detect major QTL. Genotyping was done with 980 SNP markers. Table 6 provides markers useful for detecting QTL associated with Goss' Wilt in the I208993/LH295 haploid mapping population.

TABLE 6 Markers useful for detecting QTL associated with Goss' Wilt in the I208993/LH295 haploid mapping population.

SNP Marker  Chr.  Chr. pos  Goss' Wilt QTL  Likelihood ratio  LOD       Additive Effect  Favorable Allele  SEQ ID  SNP Position*
NC0199051   1     19.3      1               28.02118          6.084721  −0.22604         G                 1274    141
NC0105051   1     31.4      3               28.79147          6.251987  −0.236914        C                 24      426
NC0032288   1     133.6     10              31.20763          6.77665   0.252864         C                 1275    413
NC0070305   1     166.5     13              29.73574          6.457033  0.216902         A                 158     532
NC0143411   2     15.4      22              31.80736          6.90688   −0.372898        C                 218     401
NC0199732   2     37        24              51.17309          11.11209  −0.506613        T                 1276    138
NC0013275   2     49.7      25              56.78186          12.33002  −0.677671        T                 236     430
NC0199350   2     67.8      26              57.35414          12.45429  −0.577154        G                 1277    226
NC0019110   2     75.1      27              51.54673          11.19323  −0.633508        C                 1278    153
NC0027319   2     93.2      29              41.90672          9.099928  −0.572435        T                 272     54
NC0104528   3     24.6      37              29.36476          6.376476  −0.189689        G                 1247    117
NC0019963   3     40.6      39              32.03588          6.956503  −0.139199        C                 368     1173
NC0077220   3     43.2      39              27.90631          6.059777  −0.133108        A                 1279    149
NC0108727   3     77.4      122             32.5836           7.075438  −0.031362        C                 375     241
NC0039785   3     94.5      123             30.35128          6.590696  −0.083537        T                 401     512
NC0031720   3     99.7      123             46.9907           10.2039   0.199348         G                 408     434
NC0200377   3     116.9     43              47.01889          10.21002  0.181809         A                 1280    352
NC0199741   3     125.7     44              28.60384          6.211245  −0.315998        A                 1281    294
NC0041040   3     145.4     45              36.85657          8.003303  −0.551354        A                 440     497
NC0055502   4     1.8       124             36.00788          7.819012  −0.390433        C                 498     105
NC0040461   4     51.2      125             42.90587          9.316891  −0.469569        A                 1282    366
NC0199420   4     102.9     55              43.93528          9.540424  −0.452635        G                 1283    356
NC0036240   4     112       56              38.3635           8.330528  −0.381557        A                 587     441
NC0028933   4     127.6     57              29.32225          6.367245  0.144007         C                 599     355
NC0147712   4     136.7     58              33.6318           7.303051  0.185174         A                 1284    74
NC0028579   4     155.7     60              37.46012          8.134361  0.109588         A                 629     242
NC0029487   4     171.1     126             38.35712          8.329143  0.101598         G                 1285    159
NC0200359   5     11.7      63              27.52949          5.977952  −0.167336        A                 1286    196
NC0040571   5     88.4      69              59.435            12.90615  −0.58299         C                 721     154
NC0017678   5     103.8     71              69.69769          15.13466  −0.722151        A                 733     171
NC0083876   5     124       72              29.09207          6.317263  −0.392793        T                 744     513
NC0200323   5     174.8     74              27.01332          5.865868  −0.253474        A                 1287    181
NC0027347   7     43.8      86              57.87354          12.56708  −0.542521        A                 896     128
NC0201872   7     64.4      88              58.07534          12.6109   −0.54188         C                 1288    208
NC0145922   7     80.5      89              26.87412          5.835642  −0.271008        G                 940     451
NC0071001   7     99.4      90              26.59882          5.775861  −0.262452        T                 951     359
NC0199879   7     112.1     92              34.51543          7.494931  −0.28773         A                 1289    228
NC0200055   7     122.3     127             36.14355          7.848472  −0.277751        T                 1290    116
NC0110771   7     138.5     93              32.98577          7.162769  −0.163457        A                 976     490
NC0200495   7     155.9     95              27.69812          6.014571  −0.118782        G                 1291    302
NC0028095   9     59.4      107             29.92602          6.498353  0.142796         C                 1098    116
NC0144850   9     67        108             30.50354          6.62376   0.146897         G                 1292    244
NC0030134   10    79.4      120             27.87616          6.05323   −0.317779        TCCACTAT          1215    94
NC0200312   10    85.7      128             31.10615          6.754615  −0.355789        A                 1293    89

*SNP position: refers to the position of the SNP polymorphism in the indicated SEQ ID NO.
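By way of non-limiting illustration, the restriction of the mapping set to plants rated 1 or 9 can be sketched as follows. The sketch keeps only the extreme ratings and then compares allele frequencies at one marker between the two resulting bulks. It is a simplified contrast between bulks and does not implement composite interval mapping. The record layout, marker name "M1", plant identifiers, and function names are hypothetical.

```python
from collections import Counter


def select_extremes(plants, low=1, high=9):
    """Keep only plants rated at either extreme of the 1/5/9 scale."""
    return [p for p in plants if p["rating"] in (low, high)]


def allele_frequency_by_bulk(plants, marker, low=1, high=9):
    """Return {rating: {allele: frequency}} for the two extreme bulks.

    A haploid plant carries a single allele per locus, so each
    genotype call is stored as one allele rather than a pair.
    """
    freqs = {}
    for rating in (low, high):
        alleles = [p["genotype"][marker] for p in plants if p["rating"] == rating]
        counts = Counter(alleles)
        total = sum(counts.values())
        freqs[rating] = {a: n / total for a, n in counts.items()} if total else {}
    return freqs


# Hypothetical haploid records; ratings follow the 1, 5, or 9 scale.
plants = [
    {"id": "H1", "rating": 1, "genotype": {"M1": "T"}},
    {"id": "H2", "rating": 1, "genotype": {"M1": "T"}},
    {"id": "H3", "rating": 5, "genotype": {"M1": "C"}},
    {"id": "H4", "rating": 9, "genotype": {"M1": "C"}},
    {"id": "H5", "rating": 9, "genotype": {"M1": "T"}},
]

extremes = select_extremes(plants)
bulk_freqs = allele_frequency_by_bulk(extremes, "M1")
print(bulk_freqs)
```

A marked difference in allele frequency between the two bulks at a marker would suggest, but not establish, linkage between that marker and a QTL for the rated trait.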

Example 7 Preselection of Haploids for Doubling

The utility of haploid plants in genetic mapping of traits of interest is demonstrated in the following example. A haploid mapping population is developed by inducing a family-based pedigree, such as an F3 or BC1S1, to produce haploid seeds. The haploid seeds are planted in ear rows which represent the parents from the F3 or BC1S1 population, and remnant seed is stored for doubling needs after phenotyping is completed. For mapping, SNP markers are used to screen the haploid population. Composite interval mapping is conducted to examine significant associations between a trait of interest and the SNP markers. Such traits can include, but are not limited to, disease resistance, herbicide tolerance, insect or pest resistance, altered fatty acid, protein or carbohydrate metabolism, increased grain yield, increased oil, enhanced nutritional content, increased growth rates, enhanced stress tolerance, preferred maturity, enhanced organoleptic properties, altered morphological characteristics, sterility, other agronomic traits, traits for industrial uses, or traits for consumer appeal. Remnant seed can be doubled through methods known in the art. Genotypic and phenotypic data can be used to select which remnant seed families to double. Doubled plants can be utilized for further breeding, commercial breeding, or additional fine-mapping purposes.
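By way of non-limiting illustration, the use of genotypic and phenotypic data to choose remnant seed families for doubling can be sketched as follows. Each ear row is summarized by the fraction of its haploids carrying a favorable allele and by its mean phenotypic score, and families meeting both thresholds are selected. The thresholds, the assumption that a lower score is preferred, the record layout, and the function name are all hypothetical.

```python
from collections import defaultdict


def select_families_to_double(haploids, favorable_allele,
                              min_fav_fraction, max_mean_score):
    """Return the families whose haploids meet both selection criteria."""
    by_family = defaultdict(list)
    for h in haploids:
        by_family[h["family"]].append(h)

    selected = []
    for family, members in sorted(by_family.items()):
        # Fraction of haploids in this ear row carrying the favorable allele.
        fav_fraction = sum(m["allele"] == favorable_allele for m in members) / len(members)
        # Mean phenotypic score; lower is assumed to be preferred here.
        mean_score = sum(m["score"] for m in members) / len(members)
        if fav_fraction >= min_fav_fraction and mean_score <= max_mean_score:
            selected.append(family)
    return selected


# Hypothetical haploid records from two ear rows (families A and B).
haploids = [
    {"family": "A", "allele": "T", "score": 1},
    {"family": "A", "allele": "T", "score": 3},
    {"family": "B", "allele": "T", "score": 5},
    {"family": "B", "allele": "C", "score": 9},
]

selected = select_families_to_double(
    haploids, favorable_allele="T", min_fav_fraction=0.75, max_mean_score=4.0
)
print(selected)
```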

Example 9 Use of Haploid Seed for Preselection in a High Oil Breeding Program

The methods of the present invention can be used in a high oil corn breeding program. Haploid kernels with at least one preferred marker, such as oil content, can be selected according to the present invention. Preselection breeding methods are utilized to preselect and prescreen lines for oil and agronomic traits such as yield, using markers selected from the group consisting of genetic markers, protein composition, protein levels, oil composition, oil levels, carbohydrate composition, carbohydrate levels, fatty acid composition, fatty acid levels, amino acid composition, amino acid levels, biopolymers, pharmaceuticals, starch composition, starch levels, fermentable starch, fermentation yield, fermentation efficiency, energy yield, secondary compounds, metabolites, morphological characteristics, and agronomic characteristics.

Populations are identified for submission to the doubled haploid (DH) process. QTL and/or genomic regions of interest are identified in one or more parents in the population for targets of selection that are associated with improved agronomic traits such as yield, moisture, and test weight. In other aspects, QTL are identified that are associated with improved oil composition and/or increased oil composition. In one aspect, two or more QTL are selected.

The population undergoing haploid induction is characterized for oil content using methods known in the art, non-limiting examples of which include NIT, NIR, NMR, and MRI, wherein seed is measured in bulk and/or on a single seed basis. Methods to measure oil content in single seeds have been described (Kotyk, J., et al., Journal of American Oil Chemists' Society 82: 855-862 (2005)). In one aspect, single-kernel analysis (SKA) is conducted via magnetic resonance or other methods. In another aspect, oil content is measured per ear using analytical methods known in the art, and the selected ears are bulked before undergoing SKA. The resulting data are used to select single kernels that fall within an oil range acceptable to the breeder to meet the product concept.

The selected population is sent to the DH facilities and induced. Putative haploid kernels are selected and non-destructively sampled for subsequent genotyping. Apparatus and methods for the high-throughput, non-destructive sampling of seeds have been described which would overcome the obstacles of statistical samples by allowing for individual seed analysis. For example, U.S. patent application Ser. No. 11/213,430 (filed Aug. 26, 2005); U.S. patent application Ser. No. 11/213,431 (filed Aug. 26, 2005); U.S. patent application Ser. No. 11/213,432 (filed Aug. 26, 2005); U.S. patent application Ser. No. 11/213,434 (filed Aug. 26, 2005); U.S. patent application Ser. No. 11/213,435 (filed Aug. 26, 2005); and U.S. patent application Ser. No. 11/680,611 (filed Mar. 2, 2007), which are incorporated herein by reference in their entirety, disclose apparatus and systems for the automated sampling of seeds as well as methods of sampling, testing and bulking seeds.

The seed samples are genotyped using the markers corresponding to the one or more QTL of interest. Seeds are selected based upon their genotypes for these QTL.

Seed may be selected based on preferred QTL alleles or, for the purpose of additional mapping, both ends of the distribution are selected. That is, seed is selected based on preferred and less preferred alleles for at least one QTL and/or preferred and less preferred phenotypic performance for at least one phenotype and/or preferred and less preferred predicted phenotypic performance for at least one phenotype.
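By way of non-limiting illustration, selection of both ends of a distribution for additional mapping can be sketched as follows. Seeds are ranked by a measured or predicted value, and an equal fraction is taken from each tail. The seed identifiers, the values, the 20% tail fraction, and the function name are hypothetical.

```python
def select_tails(value_by_seed, fraction):
    """Return (lower_tail, upper_tail) lists of seed IDs.

    Each tail holds `fraction` of the seeds (at least one seed),
    ranked by the value supplied for each seed.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in the interval (0, 0.5]")
    ranked = sorted(value_by_seed, key=value_by_seed.get)
    n = max(1, int(len(ranked) * fraction))
    return ranked[:n], ranked[-n:]


# Hypothetical per-seed values (for example, a predicted phenotype).
value_by_seed = {
    "s1": 2.0, "s2": 8.5, "s3": 4.1, "s4": 6.3, "s5": 3.0,
    "s6": 7.2, "s7": 5.0, "s8": 9.1, "s9": 1.5, "s10": 4.8,
}

lower_tail, upper_tail = select_tails(value_by_seed, fraction=0.2)
print(lower_tail, upper_tail)
```

Which tail corresponds to the preferred class depends on the trait and the scale used, so the sketch returns both tails without labeling either as preferred.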

Haploid kernels can also be selected and processed by methods known in the art such as NMR or MRI to characterize oil content. Kernels with preferred oil content are selected. As illustrated above, for research purposes, kernels may be selected with low, high, or average oil content in order to identify the genetic basis for oil content. In one aspect, relative oil content in germ and endosperm is characterized by taking an NMR measurement on whole kernel, wherein subsequent NMR measurements are taken on dissected germ and endosperm. In another aspect, kernels are imaged using MRI to identify the relative oil content in germ and endosperm tissue.

In another aspect, seed samples are analyzed for oil content and the genotype for at least one QTL or genomic region of interest, enabling pre-selection for high oil corn with suitable agronomic performance.
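By way of non-limiting illustration, pre-selection on both oil content and genotype can be sketched as follows. A kernel is retained only if its measured oil content meets a minimum and it carries the preferred allele at every QTL marker of interest. The marker names, allele calls, oil values, threshold, and function name are hypothetical.

```python
def preselect_kernels(kernels, min_oil_pct, preferred_alleles):
    """Return IDs of kernels meeting the oil and genotype criteria.

    `preferred_alleles` maps each marker of interest to its preferred
    allele; a haploid kernel has one allele call per marker.
    """
    selected = []
    for kernel in kernels:
        has_alleles = all(
            kernel["genotype"].get(marker) == allele
            for marker, allele in preferred_alleles.items()
        )
        if kernel["oil_pct"] >= min_oil_pct and has_alleles:
            selected.append(kernel["id"])
    return selected


# Hypothetical single-kernel oil measurements and marker calls.
kernels = [
    {"id": "k1", "oil_pct": 7.4, "genotype": {"QTL_A": "G", "QTL_B": "T"}},
    {"id": "k2", "oil_pct": 7.9, "genotype": {"QTL_A": "A", "QTL_B": "T"}},
    {"id": "k3", "oil_pct": 4.2, "genotype": {"QTL_A": "G", "QTL_B": "T"}},
    {"id": "k4", "oil_pct": 8.1, "genotype": {"QTL_A": "G", "QTL_B": "T"}},
]

chosen = preselect_kernels(
    kernels, min_oil_pct=7.0, preferred_alleles={"QTL_A": "G", "QTL_B": "T"}
)
print(chosen)
```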

The selected haploid kernels are then doubled on the basis of analytic and/or genotypic data. In one aspect, following doubling, the putative DHs can be screened using genetic and/or analytic methods as described above.

Notably, analytic methods for detection of oil are not restricted to NMR; other relevant methods include IR-type instruments and MRI. Also, samples can be analyzed in bulk or on a single seed basis, wherein the capability exists to select material based on a preferred oil content. In certain aspects, a preferred oil content is a decreased oil content, which may be useful in the development of mapping populations for the detection of oil content QTL.

Selected haploid plants can be used to map for oil traits.

In view of the foregoing, it will be seen that the several advantages of the invention are achieved and attained.

The embodiments were chosen and described in order to best explain the principles of the invention and its practical application to thereby enable others skilled in the art to best utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated.

Various patent and non-patent publications are cited herein, the disclosures of each of which are, to the extent necessary, incorporated herein by reference in their entireties.

As various modifications could be made in the constructions and methods herein described and illustrated without departing from the scope of the invention, it is intended that all matter contained in the foregoing description or shown in the accompanying drawings shall be interpreted as illustrative rather than limiting. The breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims appended hereto and their equivalents. 

1. A method for the association of at least one genotype with at least one phenotype using a haploid plant comprising: a) assaying at least one genotype of at least one haploid plant with at least one genetic marker; and b) associating the at least one marker with at least one phenotypic trait.
 2. The method of claim 1, wherein the at least one genetic marker comprises a single nucleotide polymorphism (SNP), an insertion or deletion in DNA sequence (Indel), a simple sequence repeat of DNA sequence (SSR), a restriction fragment length polymorphism, a haplotype, or a tag SNP.
 3. The method of claim 1, further comprising the step of using an association determined in step (b) to make a selection in a plant breeding program.
 4. The method of claim 3, wherein said selection comprises selecting among breeding populations based on the at least one genotype.
 5. The method of claim 3, wherein said selection comprises selecting progeny in one or more breeding populations based on the at least one genotype.
 6. The method of claim 3, wherein said selection comprises selecting among parental lines based on prediction of progeny performance.
 7. The method of claim 3, wherein said selection comprises selection of a line for advancement in a germplasm improvement activity based on the at least one genotype.
 8. The method of claim 3, further comprising the step of doubling at least one haploid plant selected in said breeding program to obtain a doubled haploid plant.
 9. The method of claim 8, wherein the doubled haploid plant is used for introgression of the genotype of interest into at least a second plant for use in a plant breeding program.
 10. The method of claim 1, wherein said haploid plant in step (a) is obtained from a haploid breeding population.
 11. A method for identifying an association of a plant genotype with one or more traits of interest comprising: a) screening a plurality of haploid plants displaying heritable variation for at least one trait wherein the heritable variation is linked to at least one genotype; and b) associating at least one genotype of at least one haploid plant to at least one trait.
 12. The method of claim 11, wherein the genotype comprises a genetic marker.
 13. The method of claim 11, further comprising the step of using an association determined in step (b) to make a selection in a plant breeding program.
 14. The method of claim 13, wherein said selection comprises selecting among breeding populations based on the at least one genotype.
 15. The method of claim 13, wherein said selection comprises selecting progeny in one or more breeding populations based on the at least one genotype.
 16. The method of claim 13, wherein said selection comprises selecting among parental lines based on prediction of progeny performance.
 17. The method of claim 13, wherein said selection comprises selection of a line for advancement in a germplasm improvement activity based on the at least one genotype.
 18. The method of claim 13, further comprising the step of doubling at least one haploid plant selected in said breeding program to obtain a doubled haploid plant.
 19. The method of claim 18, wherein the doubled haploid plant is used for introgressing the genotype of interest into a plant for use in a plant breeding program.
 20. The method of claim 11, wherein said haploid plant or plants comprise an intact plant, a leaf, vascular tissue, flower, pod, root, stem, seed or portion thereof.
 21. The method of claim 11, wherein the plants are selected from the group consisting of maize (Zea mays), soybean (Glycine max), cotton (Gossypium hirsutum), peanut (Arachis hypogaea), barley (Hordeum vulgare); oats (Avena sativa); orchard grass (Dactylis glomerata); rice (Oryza sativa, including indica and japonica varieties); sorghum (Sorghum bicolor); sugar cane (Saccharum sp); tall fescue (Festuca arundinacea); turfgrass species (e.g. species: Agrostis stolonifera, Poa pratensis, Stenotaphrum secundatum); wheat (Triticum aestivum), and alfalfa (Medicago sativa), members of the genus Brassica, carrot, cucumber, dry bean, eggplant, fennel, garden beans, gourd, leek, lettuce, melon, okra, onion, pea, pepper, pumpkin, radish, spinach, squash, sweet corn, tomato, watermelon, and ornamental plants.
 22. The method of claim 11, wherein the haploid plant is a fruit, vegetable, tuber, or root crop.
 23. The method of claim 11, wherein the trait is selected from the group consisting of herbicide tolerance, disease resistance, insect or pest resistance, altered fatty acid, protein or carbohydrate metabolism, increased grain yield, increased oil, enhanced nutritional content, increased growth rates, enhanced stress tolerance, preferred maturity, enhanced organoleptic properties, altered morphological characteristics, sterility, a trait for industrial use, and a trait for consumer appeal.
 24. A method for the association of at least one phenotype with at least one genetic marker using a haploid plant comprising: a) assaying at least one phenotype of at least one haploid plant with at least one phenotypic marker to determine the presence or absence of said phenotype; and b) associating the presence or absence of said phenotype with at least one genetic marker.
 25. The method of claim 24, wherein said haploid plant is obtained from a haploid breeding population.
 26. The method of claim 24, wherein the at least one genetic marker comprises a single nucleotide polymorphism (SNP), an insertion or deletion in DNA sequence (Indel), a simple sequence repeat of DNA sequence (SSR), a restriction fragment length polymorphism, a haplotype, or a tag SNP.
 27. The method of claim 24, wherein the at least one phenotypic marker comprises at least one of a transcriptional profile, a metabolic profile, a nutrient composition profile, a protein expression profile, protein composition, protein levels, oil composition, oil levels, carbohydrate composition, carbohydrate levels, fatty acid composition, fatty acid levels, amino acid composition, amino acid levels, biopolymers, pharmaceuticals, starch composition, starch levels, fermentable starch, fermentation yield, fermentation efficiency, energy yield, secondary compounds, metabolites, morphological characteristics, or an agronomic characteristic.
 28. The method of claim 24, further comprising the step of using an association determined in step (b) to make a selection in a plant breeding program.
 29. The method of claim 28, further comprising the step of doubling at least one haploid plant selected in said breeding program to obtain a doubled haploid plant.
 30. The method of claim 29, wherein the doubled haploid plant is used for introgression of the genotype of interest into at least a second plant for use in a plant breeding program. 